Minimizing Finite Sums with the Stochastic Average Gradient

Mark Schmidt; Nicolas Le Roux; Francis Bach

doi:10.1007/s10107-016-1030-6

Journal article, Mathematical Programming, Year: 2017


Mark Schmidt

Role: Author
PersonId : 908887

Statistical Machine Learning and Parsimony

Laboratoire d'informatique de l'école normale supérieure

Nicolas Le Roux

Role: Author
PersonId : 905842

Laboratoire d'informatique de l'école normale supérieure

Statistical Machine Learning and Parsimony

Francis Bach

Role: Author
PersonId : 863126

Laboratoire d'informatique de l'école normale supérieure

Statistical Machine Learning and Parsimony

Abstract

We propose the stochastic average gradient (SAG) method for optimizing the sum of a finite number of smooth convex functions. Like stochastic gradient (SG) methods, the SAG method's iteration cost is independent of the number of terms in the sum. However, by incorporating a memory of previous gradient values, the SAG method achieves a faster convergence rate than black-box SG methods. The convergence rate is improved from O(1/k^{1/2}) to O(1/k) in general, and when the sum is strongly convex the convergence rate is improved from the sublinear O(1/k) to a linear convergence rate of the form O(p^k) for p < 1. Further, in many cases the convergence rate of the new method is also faster than that of black-box deterministic gradient methods, in terms of the number of gradient evaluations. Numerical experiments indicate that the new algorithm often dramatically outperforms existing SG and deterministic gradient methods, and that the performance may be further improved through the use of non-uniform sampling strategies.
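The idea of keeping a memory of previous gradient values can be illustrated with a minimal sketch. The code below is not the authors' implementation: the least-squares objective, the function name `sag_least_squares`, the zero initialization of the gradient memory, the conservative step size, and the toy data are all assumptions chosen for illustration. Each iteration evaluates the gradient of a single, uniformly sampled term, replaces that term's stored gradient, and steps along the average of all stored gradients, so the per-iteration cost does not grow with the number of terms.

```python
import random


def sag_least_squares(A, b, step, iters, seed=0):
    """Illustrative SAG sketch for f(x) = (1/n) * sum_i 0.5 * (a_i . x - b_i)^2."""
    rng = random.Random(seed)
    n, dim = len(A), len(A[0])
    x = [0.0] * dim
    memory = [[0.0] * dim for _ in range(n)]  # last gradient seen for each term
    grad_sum = [0.0] * dim                    # running sum of the stored gradients
    for _ in range(iters):
        i = rng.randrange(n)                  # sample one term uniformly
        residual = sum(a * xj for a, xj in zip(A[i], x)) - b[i]
        new_grad = [residual * a for a in A[i]]
        for j in range(dim):
            # Swap the old stored gradient of term i for the new one in the sum,
            # then step along the average of all stored gradients.
            grad_sum[j] += new_grad[j] - memory[i][j]
            x[j] -= step / n * grad_sum[j]
        memory[i] = new_grad
    return x


# Toy data (assumed): a consistent system whose solution is x = [2, -1].
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
b = [2.0, -1.0, 1.0, 3.0]
L = max(sum(a * a for a in row) for row in A)  # gradient Lipschitz constant of the terms
x_hat = sag_least_squares(A, b, step=1.0 / (16.0 * L), iters=20000)
print(x_hat)  # should be close to [2.0, -1.0]
```

The memory table costs one stored gradient per term; for linear models such as this one, a single scalar per term (the residual) would suffice, which is a design choice left out of the sketch for clarity.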

Keywords

Convergence rates; Convex optimization; Stochastic gradient

Domains

Optimization and Control [math.OC]; Machine Learning [cs.LG]; Machine Learning [stat.ML]; Computation [stat.CO]

Main file

sagMP.pdf (926.31 KB)

Origin: Files produced by the author(s)


https://inria.hal.science/hal-00860051

Submitted on: Tuesday, May 10, 2016, 22:28:36

Last modified on: Friday, April 19, 2024, 16:18:55

Dates and versions

hal-00860051 , version 1 (10-09-2013)

hal-00860051 , version 2 (10-05-2016)

Identifiers

HAL Id : hal-00860051 , version 2
ARXIV : 1309.2388
DOI : 10.1007/s10107-016-1030-6

Cite

Mark Schmidt, Nicolas Le Roux, Francis Bach. Minimizing Finite Sums with the Stochastic Average Gradient. Mathematical Programming, 2017, 162 (1-2), pp.83-112. ⟨10.1007/s10107-016-1030-6⟩. ⟨hal-00860051v2⟩


Collections

ENS-PARIS CNRS INRIA INRIA2 TDS-MACS PSL

4069 views

15571 downloads
