Conference paper, Year: 2022

Convergence beyond the over-parameterized regime using Rayleigh quotients

David Robin
  • Role: Author
  • PersonId : 1216866
  • IdHAL : robindar
Kevin Scaman
  • Role: Author
  • PersonId : 1062981
Marc Lelarge
  • Role: Author
  • PersonId : 833445

Abstract

In this paper, we present a new strategy to prove the convergence of deep learning architectures to zero training (or even testing) loss under gradient flow. Our analysis centers on Rayleigh quotients, which we use to prove Kurdyka-Łojasiewicz inequalities for a broader class of neural network architectures and loss functions. We show that Rayleigh quotients provide a unified view of several convergence analysis techniques in the literature. Our strategy yields convergence proofs for various examples of parametric learning. In particular, our analysis requires neither the number of parameters to tend to infinity nor the number of samples to be finite, and thus extends to test-loss minimization and beyond the over-parameterized regime.
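For reference, the standard textbook forms of the objects named in the abstract are sketched below in LaTeX. The constant $c$, the exponent $\beta$, and the parameter symbol $w$ are illustrative assumptions; the exact statements used in the paper may differ.

% Rayleigh quotient of a symmetric matrix A at a nonzero vector x,
% bounded by the extreme eigenvalues of A:
\[
  R_A(x) \;=\; \frac{\langle x, A x \rangle}{\langle x, x \rangle},
  \qquad
  \lambda_{\min}(A) \;\le\; R_A(x) \;\le\; \lambda_{\max}(A).
\]

% A Lojasiewicz-type gradient inequality for a nonnegative loss L(w)
% (c > 0 and beta in [1/2, 1) are illustrative constants):
\[
  \lVert \nabla \mathcal{L}(w) \rVert \;\ge\; c\, \mathcal{L}(w)^{\beta},
  \qquad c > 0,\ \beta \in [\tfrac{1}{2}, 1).
\]

% Along the gradient flow \dot{w}_t = -\nabla\mathcal{L}(w_t), this gives
\[
  \frac{\mathrm{d}}{\mathrm{d}t}\, \mathcal{L}(w_t)
  \;=\; -\lVert \nabla \mathcal{L}(w_t) \rVert^2
  \;\le\; -c^2\, \mathcal{L}(w_t)^{2\beta},
\]
% so L(w_t) -> 0: exponentially fast when beta = 1/2 (the Polyak-Lojasiewicz
% case) and at a polynomial rate when beta > 1/2.

As stated in the abstract, the paper's contribution is to obtain inequalities of this type via Rayleigh quotients for a broader class of architectures and losses, without requiring the over-parameterized regime.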
Main file: 3433_convergence_beyond_the_over_pa.pdf (2.22 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03896153, version 1 (13-12-2022)

Identifiers

  • HAL Id: hal-03896153, version 1

Cite

David Robin, Kevin Scaman, Marc Lelarge. Convergence beyond the over-parameterized regime using Rayleigh quotients. NeurIPS 2022 - 36th Conference on Neural Information Processing Systems, Nov 2022, New Orleans, United States. ⟨hal-03896153⟩
