Minimizing Finite Sums with the Stochastic Average Gradient

Mark Schmidt; Nicolas Le Roux; Francis Bach

Preprint, Working Paper. Year: 2013



Mark Schmidt

Role: Author
PersonId : 908887

Statistical Machine Learning and Parsimony

Laboratoire d'informatique de l'école normale supérieure

Nicolas Le Roux

Role: Author
PersonId : 905842

Statistical Machine Learning and Parsimony

Laboratoire d'informatique de l'école normale supérieure

Francis Bach

Role: Author
PersonId : 863126

Statistical Machine Learning and Parsimony

Laboratoire d'informatique de l'école normale supérieure

Abstract

We propose the stochastic average gradient (SAG) method for optimizing the sum of a finite number of smooth convex functions. Like stochastic gradient (SG) methods, the SAG method's iteration cost is independent of the number of terms in the sum. However, by incorporating a memory of previous gradient values, the SAG method achieves a faster convergence rate than black-box SG methods. The convergence rate is improved from O(1/k^{1/2}) to O(1/k) in general, and when the sum is strongly convex the convergence rate is improved from the sub-linear O(1/k) to a linear convergence rate of the form O(p^k) for p < 1. Further, in many cases the convergence rate of the new method is also faster than that of black-box deterministic gradient methods, in terms of the number of gradient evaluations. Numerical experiments indicate that the new algorithm often dramatically outperforms existing SG and deterministic gradient methods, and that the performance may be further improved through the use of non-uniform sampling strategies.
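The "memory of previous gradient values" mentioned in the abstract can be illustrated with a minimal sketch. It assumes a least-squares objective, uniform sampling, and gradient memory initialized to zero. The function name `sag_least_squares`, the toy data, and the step size are illustrative assumptions, not the authors' implementation. Each iteration evaluates the gradient of a single term, swaps it into the stored table, and steps along the average of all stored gradients, so the per-iteration cost does not grow with the number of terms.

```python
import random


def sag_least_squares(A, b, step, iters, seed=0):
    """SAG-style minimization of (1/n) * sum_i 0.5 * (a_i . x - b_i)**2.

    A is a list of n rows (each a list of d floats), b a list of n floats.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    x = [0.0] * d
    grad_memory = [[0.0] * d for _ in range(n)]  # last gradient seen per term
    grad_sum = [0.0] * d                         # sum of all stored gradients

    for _ in range(iters):
        i = rng.randrange(n)  # pick one term uniformly at random
        residual = sum(A[i][j] * x[j] for j in range(d)) - b[i]
        new_grad = [residual * A[i][j] for j in range(d)]
        for j in range(d):
            # Replace the old stored gradient of term i inside the running sum,
            # then step along the average of the stored gradients.
            grad_sum[j] += new_grad[j] - grad_memory[i][j]
            x[j] -= step * grad_sum[j] / n
        grad_memory[i] = new_grad
    return x


if __name__ == "__main__":
    # Toy problem (assumed data): b = A @ [2, -1], so the minimizer is [2, -1].
    A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
    b = [2.0, -1.0, 1.0, 3.0]
    L = max(sum(v * v for v in row) for row in A)  # per-term smoothness constant
    print(sag_least_squares(A, b, step=1.0 / (16.0 * L), iters=5000))
```

The running sum `grad_sum` is what keeps the update cheap: only one stored gradient changes per iteration, so the average is refreshed in O(d) time instead of being recomputed over all n terms. The conservative step size of 1/(16L) is an assumption made for this sketch; larger steps may work in practice.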

Keywords

Convex Optimization; Stochastic Gradient; Convergence Rates

Domains

Optimization and Control [math.OC]; Machine Learning [cs.LG]; Machine Learning [stat.ML]; Computation [stat.CO]

Main file

sag_journal.pdf (777.99 KB)

Origin: Files produced by the author(s)


https://inria.hal.science/hal-00860051

Submitted on: Tuesday, September 10, 2013, 01:00:36

Last modified on: Friday, April 19, 2024, 16:18:55

Long-term archiving on: Thursday, April 6, 2017, 16:55:03

Dates and versions

hal-00860051, version 1 (10-09-2013)

hal-00860051, version 2 (10-05-2016)

Identifiers

HAL Id: hal-00860051, version 1
arXiv: 1309.2388

Cite

Mark Schmidt, Nicolas Le Roux, Francis Bach. Minimizing Finite Sums with the Stochastic Average Gradient. 2013. ⟨hal-00860051v1⟩


4085 Views

15673 Downloads
