Title: On estimating generalization gaps via the functional variance in overparameterized models
Authors: Keisuke Yano - The Institute of Statistical Mathematics (Japan) [presenting]
Akifumi Okuno - The Institute of Statistical Mathematics (Japan)
Abstract: The focus is on generalization gap estimation for overparameterized models such as deep neural networks. We show that the functional variance, a key concept in defining the widely applicable information criterion (WAIC), closely approximates the difference between the generalization error and the empirical error in overparameterized linear regression models. Overparameterized linear regression models arise in regimes where deep neural networks are well approximated by linear functions of their parameters, for example, the neural tangent kernel regime. For practical implementation, we propose a Langevin approximation of the functional variance, which can be implemented alongside gradient-based optimization algorithms and requires only the first-order gradient of the loss function. Through numerical experiments, we demonstrate the applicability of the Langevin functional variance to overparameterized models.
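The Langevin approach described above can be sketched in code. The following is a minimal illustrative sketch, not the authors' implementation: it assumes a Gaussian-likelihood overparameterized linear regression, draws approximate posterior samples with unadjusted Langevin dynamics (using only first-order gradients), and estimates the functional variance as the sum over data points of the posterior variance of per-example log-losses. All hyperparameters (step size, temperature, iteration counts) are illustrative assumptions.

```python
# Hedged sketch (assumptions, not the paper's exact method): estimate the
# functional variance of an overparameterized linear regression via
# unadjusted Langevin dynamics, using only first-order gradients.
import numpy as np

rng = np.random.default_rng(0)

# Overparameterized setting: more parameters d than samples n.
n, d = 20, 100
X = rng.normal(size=(n, d)) / np.sqrt(d)
theta_true = rng.normal(size=d)
y = X @ theta_true + 0.1 * rng.normal(size=n)

def per_example_loss(theta):
    # Per-example negative log-likelihood (Gaussian, unit variance,
    # constants dropped); its posterior variance defines the functional
    # variance in the WAIC sense.
    r = y - X @ theta
    return 0.5 * r ** 2  # shape (n,)

def grad_total_loss(theta):
    # First-order gradient of the summed loss -- all that Langevin needs.
    return -X.T @ (y - X @ theta)

# Unadjusted Langevin dynamics, started near a minimum-norm interpolant.
eta, n_steps, burn_in = 1e-3, 2000, 500
theta = np.linalg.lstsq(X, y, rcond=None)[0]
losses = []
for t in range(n_steps):
    noise = rng.normal(size=d)
    theta = theta - eta * grad_total_loss(theta) + np.sqrt(2 * eta) * noise
    if t >= burn_in:
        losses.append(per_example_loss(theta))

L = np.stack(losses)  # (n_samples, n)
# Functional variance: sum over data points of the posterior (here,
# Langevin-sample) variance of per-example losses.
functional_variance = L.var(axis=0).sum()
print(f"Langevin functional variance estimate: {functional_variance:.4f}")
```

The resulting scalar plays the role of a generalization-gap estimate: it is added to the empirical error to approximate the generalization error, in analogy with the WAIC correction term.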