Conference paper, 2022

Generalization Bounds using Lower Tail Exponents in Stochastic Optimizers

Abstract

Despite the ubiquitous use of stochastic optimization algorithms in machine learning, the precise impact of these algorithms and their dynamics on generalization performance in realistic non-convex settings is still poorly understood. While recent work has revealed connections between generalization and heavy-tailed behavior in stochastic optimization, it has relied mainly on continuous-time approximations, and a rigorous treatment of the original discrete-time iterations has yet to be carried out. To bridge this gap, we present novel bounds linking generalization to the lower tail exponent of the transition kernel associated with the optimizer around a local minimum, in both discrete- and continuous-time settings. To achieve this, we first prove a data- and algorithm-dependent generalization bound in terms of the celebrated Fernique-Talagrand functional applied to the trajectory of the optimizer. We then specialize this result by exploiting the Markovian structure of stochastic optimizers and derive bounds in terms of their (data-dependent) transition kernels. We support our theory with empirical results from a variety of neural networks, showing correlations between generalization error and lower tail exponents.
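
The lower tail exponent referred to here governs how much probability mass the optimizer's transition kernel places in small balls around a local minimum. As a purely illustrative sketch, and not the paper's actual estimation procedure, one could proxy such an exponent by fitting the log-log slope of the empirical probability that an iterate falls within radius r of an approximate minimizer, over small radii; the function and variable names below are hypothetical.

```python
import numpy as np

def estimate_lower_tail_exponent(iterates, reference, radii=None):
    """Illustrative proxy (not the paper's procedure): fit the slope of
    log P(||x_k - x*|| <= r) against log r over small r, treating the slope
    as a crude estimate of the lower tail exponent near the minimizer."""
    dists = np.linalg.norm(iterates - reference, axis=1)
    if radii is None:
        # Small quantiles of the observed distances define the "lower tail".
        radii = np.quantile(dists, np.linspace(0.01, 0.2, 20))
    probs = np.array([(dists <= r).mean() for r in radii])
    mask = probs > 0
    # Slope of the empirical log-log CDF near zero serves as the exponent proxy.
    slope, _ = np.polyfit(np.log(radii[mask]), np.log(probs[mask]), 1)
    return slope

# Hypothetical usage: `trajectory` holds optimizer iterates recorded near a
# local minimum, and `x_star` is the approximate minimizer they orbit around.
rng = np.random.default_rng(0)
x_star = np.zeros(10)
trajectory = x_star + 0.1 * rng.standard_normal((5000, 10))
alpha_hat = estimate_lower_tail_exponent(trajectory, x_star)
print(f"estimated lower tail exponent: {alpha_hat:.2f}")
```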

Dates and versions

hal-03935798, version 1 (12-01-2023)

Identifiers

Cite

Liam Hodgkinson, Umut Şimşekli, Rajiv Khanna, Michael W. Mahoney. Generalization Bounds using Lower Tail Exponents in Stochastic Optimizers. ICML 2022 - 39th International Conference on Machine Learning, Jul 2022, Baltimore, United States. ⟨hal-03935798⟩