On the Impact of Overparameterization on the Training of a Shallow Neural Network in High Dimensions

Simon Martin; Francis Bach; Giulio Biroli

Preprint, Working Paper. Year: 2023

Simon Martin

Role: Author
PersonId: 1303186

Département d'informatique - ENS Paris

Laboratoire de physique de l'ENS - ENS Paris

Statistical Machine Learning and Parsimony

Francis Bach

Role: Author
PersonId: 863086

Département d'informatique - ENS Paris

Statistical Machine Learning and Parsimony

Giulio Biroli

Role: Author
PersonId: 1303187

Systèmes Désordonnés et Applications

Abstract

We study the training dynamics of a shallow neural network with quadratic activation functions and quadratic cost in a teacher-student setup. In line with previous works on the same neural architecture, the optimization is performed following the gradient flow on the population risk, where the average over data points is replaced by the expectation over their distribution, assumed to be Gaussian. We first derive convergence properties for the gradient flow and quantify the overparameterization that is necessary to achieve strong signal recovery. Then, assuming that the teachers and the students at initialization form independent orthonormal families, we derive a high-dimensional limit for the flow and show that the minimal overparameterization is sufficient for strong recovery. We verify by numerical experiments that these results hold for more general initializations.
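
To make the setup in the abstract concrete, here is a minimal sketch of a population risk of this kind and of a discretized gradient flow on it. It is an illustration under stated assumptions, not the authors' implementation: the normalization (student output `sum_i (w_i . x)^2`, risk `E[(f_W(x) - f_W*(x))^2]`, no scaling in the width or the dimension), the function names, the step size, and the example matrices are all hypothetical. It relies on the standard Gaussian identity `E[(x^T D x)^2] = 2 ||D||_F^2 + (tr D)^2` for a symmetric matrix `D` and `x ~ N(0, I_d)`, which removes the need to sample data points. The explicit Euler step stands in for the continuous-time flow.

```python
# Assumed (hypothetical) normalization: f_W(x) = sum_i (w_i . x)^2 = x^T (W W^T) x,
# where W is a d x m matrix stored as a list of d rows, and
# risk(W) = E[(f_W(x) - f_{W*}(x))^2] for x ~ N(0, I_d).

def gram(W):
    """Return A = W W^T (d x d) for W given as a list of d rows."""
    return [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in W] for row_i in W]

def delta(W, W_star):
    """Difference of student and teacher quadratic forms, W W^T - W* W*^T."""
    A, A_star = gram(W), gram(W_star)
    d = len(A)
    return [[A[i][j] - A_star[i][j] for j in range(d)] for i in range(d)]

def population_risk(W, W_star):
    """Closed form 2 ||D||_F^2 + (tr D)^2 of E[(x^T D x)^2] for Gaussian x."""
    D = delta(W, W_star)
    trace = sum(D[i][i] for i in range(len(D)))
    frob_sq = sum(v * v for row in D for v in row)
    return 2.0 * frob_sq + trace ** 2

def risk_gradient(W, W_star):
    """Gradient of the closed-form risk with respect to W: 8 D W + 4 tr(D) W."""
    D = delta(W, W_star)
    d, m = len(W), len(W[0])
    trace = sum(D[i][i] for i in range(d))
    return [[8.0 * sum(D[i][k] * W[k][j] for k in range(d)) + 4.0 * trace * W[i][j]
             for j in range(m)] for i in range(d)]

def euler_step(W, W_star, step):
    """One explicit Euler step of the gradient flow dW/dt = -grad risk(W)."""
    G = risk_gradient(W, W_star)
    return [[w - step * g for w, g in zip(w_row, g_row)]
            for w_row, g_row in zip(W, G)]

if __name__ == "__main__":
    # Illustrative values only: d = 2, one teacher neuron, two student neurons.
    W_star = [[1.0], [0.0]]
    W = [[0.5, 0.1], [0.2, 0.4]]
    for _ in range(200):
        W = euler_step(W, W_star, step=0.01)
    print(population_risk(W, W_star))
```

In this sketch the student has more neurons than the teacher, which is the overparameterized regime the abstract refers to. Because the risk depends on `W` only through `W W^T`, the student and teacher widths may differ freely.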

Keywords

High-dimensional statistics; Optimization; Machine learning

Domains

Machine Learning [stat.ML]; Optimization and Control [math.OC]; Disordered Systems and Neural Networks [cond-mat.dis-nn]

Main file

main.pdf (539.39 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-04270390

Submitted on: Monday, November 6, 2023, 11:43:13

Last modified on: Friday, April 19, 2024, 16:18:56

Dates and versions

hal-04270390, version 1 (06-11-2023)

Identifiers

HAL Id: hal-04270390, version 1
arXiv: 2311.03794

Cite

Simon Martin, Francis Bach, Giulio Biroli. On the Impact of Overparameterization on the Training of a Shallow Neural Network in High Dimensions. 2023. ⟨hal-04270390⟩

Collections

ENS-PARIS CNRS INRIA INRIA2 TDS-MACS PSL SORBONNE-UNIVERSITE LPENS UP-SCIENCES ANR PRAIRIE-IA
