On the almost sure convergence of stochastic gradient descent in non-convex problems

Panayotis Mertikopoulos; Nadav Hallak; Ali Kavis; Volkan Cevher

Conference paper. Year: 2020



Panayotis Mertikopoulos

Role: Author
PersonId : 1933
IdHAL : mertikop
ORCID : 0000-0003-2026-9616
IdRef : 253119758

Performance analysis and optimization of LARge Infrastructures and Systems

Criteo AI Lab

Nadav Hallak

Role: Author
PersonId : 1084701

Ecole Polytechnique Fédérale de Lausanne

Ali Kavis

Role: Author
PersonId : 1084702

Ecole Polytechnique Fédérale de Lausanne

Volkan Cevher

Role: Author

Ecole Polytechnique Fédérale de Lausanne

Abstract

This paper analyzes the trajectories of stochastic gradient descent (SGD) to help understand the algorithm's convergence properties in non-convex problems. We first show that the sequence of iterates generated by SGD remains bounded and converges with probability 1 under a very broad range of step-size schedules. Subsequently, going beyond existing positive probability guarantees, we show that SGD avoids strict saddle points/manifolds with probability 1 for the entire spectrum of step-size policies considered. Finally, we prove that the algorithm's rate of convergence to Hurwicz minimizers is O(1/n^p) if the method is employed with a Θ(1/n^p) step-size. This provides an important guideline for tuning the algorithm's step-size as it suggests that a cool-down phase with a vanishing step-size could lead to faster convergence; we demonstrate this heuristic using ResNet architectures on CIFAR.
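
To make the step-size policy in the abstract concrete, here is a minimal sketch of SGD with a vanishing step-size of the form gamma / n^p. Everything specific in it is an assumption chosen for illustration and is not taken from the paper: the toy objective f(x, y) = (x^2 - 1)^2 / 4 + y^2 / 2 (a strict saddle at the origin, minimizers at (±1, 0)), the additive Gaussian gradient noise, and the constants `gamma`, `p`, `sigma`. The run starts exactly at the saddle point to show the kind of behavior the abstract describes, not to reproduce the authors' experiments.

```python
import math
import random


def noisy_grad(x, y, sigma, rng):
    """Stochastic gradient of the toy objective
    f(x, y) = (x**2 - 1)**2 / 4 + y**2 / 2  (assumed, not from the paper).
    The exact gradient is (x**3 - x, y); Gaussian noise is added to each entry."""
    gx = x**3 - x + sigma * rng.gauss(0.0, 1.0)
    gy = y + sigma * rng.gauss(0.0, 1.0)
    return gx, gy


def sgd(n_steps=5000, gamma=0.5, p=0.75, sigma=0.1, seed=0):
    """Run SGD with the vanishing step-size gamma / n**p.
    All default values are illustrative assumptions."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0  # start exactly at the strict saddle point (0, 0)
    for n in range(1, n_steps + 1):
        step = gamma / n**p  # Theta(1 / n^p) step-size schedule
        gx, gy = noisy_grad(x, y, sigma, rng)
        x -= step * gx
        y -= step * gy
    return x, y


def dist_to_minimizers(x, y):
    """Euclidean distance to the nearest minimizer, (1, 0) or (-1, 0)."""
    return min(math.hypot(x - 1.0, y), math.hypot(x + 1.0, y))


if __name__ == "__main__":
    x, y = sgd()
    print("final iterate:", (x, y))
    print("distance to nearest minimizer:", dist_to_minimizers(x, y))
```

In this toy setting the gradient noise pushes the iterate off the saddle, after which the shrinking step-size damps the fluctuations around whichever minimizer the run settles near. Changing `p` or `gamma` is a simple way to see how the schedule affects the final distance.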

Keywords

Non-convex optimization; stochastic gradient descent; stochastic approximation.

2020 Mathematics Subject Classification. Primary 90C26, 62L20; secondary 90C30, 90C15, 37N40.

Domains

Optimization and control [math.OC]

Main file

NonConvexSGD-NIPS.pdf (2.64 MB)

Origin: Files produced by the author(s)

Contributor: Panayotis Mertikopoulos

https://inria.hal.science/hal-03043771

Submitted on: Monday, December 7, 2020, 14:19:44

Last modified on: Friday, April 5, 2024, 03:09:47

Long-term archiving on: Monday, March 8, 2021, 19:10:07

Dates and versions

hal-03043771 , version 1 (07-12-2020)

Identifiers

HAL Id : hal-03043771 , version 1

Cite

Panayotis Mertikopoulos, Nadav Hallak, Ali Kavis, Volkan Cevher. On the almost sure convergence of stochastic gradient descent in non-convex problems. NeurIPS 2020 - 34th International Conference on Neural Information Processing Systems, Dec 2020, Vancouver, Canada. pp.1-32. ⟨hal-03043771⟩


Collections

UGA CNRS INRIA LIG LIG_SRCPR PERSYVAL-LAB INRIA2 INRIA-EPFL TDS-MACS LIG-SRCPR-POLARIS MIAI ANR LIG_SIDCH

159 views

348 downloads
