Variance Reduced Stochastic Gradient Descent with Neighbors

Thomas Hofmann; Aurelien Lucchi; Simon Lacoste-Julien; Brian Mcwilliams

Conference paper, Year: 2015

Thomas Hofmann

Role: Author

Department of Computer Science [ETH Zürich]

Aurelien Lucchi

Role: Author

Department of Computer Science [ETH Zürich]

Simon Lacoste-Julien

Role: Author
PersonId : 1938
IdHAL : simon-lacoste-julien
ORCID : 0000-0001-6485-6180
IdRef : 22557781X

Statistical Machine Learning and Parsimony

Département d'informatique - ENS Paris

Microsoft Research - Inria Joint Centre

Brian Mcwilliams

Role: Author

Department of Computer Science [ETH Zürich]

Abstract

Stochastic Gradient Descent (SGD) is a workhorse in machine learning, yet its slow convergence can be a computational bottleneck. Variance reduction techniques such as SAG, SVRG, and SAGA have been proposed to overcome this weakness, achieving linear convergence. However, these methods either compute full gradients at pivot points or keep per-data-point corrections in memory. Speed-ups relative to SGD may therefore need a minimum number of epochs to materialize. This paper investigates algorithms that exploit neighborhood structure in the training data to share and re-use information about past stochastic gradients across data points, which offers advantages in the transient optimization phase. As a by-product, we provide a unified convergence analysis for a family of variance reduction algorithms, which we call memorization algorithms. We provide experimental results supporting our theory.
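
To make the "per-data-point corrections in memory" idea concrete, the following is a minimal sketch of a SAGA-style update on a small least-squares problem. It is not the authors' neighbor-sharing method and is not taken from the paper's code. The function name `saga_least_squares`, the toy data, and the step size are all assumptions chosen for illustration.

```python
import random


def saga_least_squares(X, y, step, epochs, seed=0):
    """SAGA-style sketch for f(w) = (1/n) * sum_i 0.5 * (x_i . w - y_i)^2.

    Hypothetical helper for illustration only; plain Python lists, no NumPy.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    # alpha[i] memorizes the last gradient computed for data point i.
    alpha = [[0.0] * d for _ in range(n)]
    # Average of all memorized gradients, maintained incrementally.
    alpha_mean = [0.0] * d

    for _ in range(epochs * n):
        i = rng.randrange(n)
        residual = sum(xk * wk for xk, wk in zip(X[i], w)) - y[i]
        grad_i = [residual * xk for xk in X[i]]

        # Variance-reduced step: fresh gradient, minus its memorized copy,
        # plus the average of all memorized gradients.
        for k in range(d):
            w[k] -= step * (grad_i[k] - alpha[i][k] + alpha_mean[k])

        # Refresh the memory for point i and update the average.
        for k in range(d):
            alpha_mean[k] += (grad_i[k] - alpha[i][k]) / n
            alpha[i][k] = grad_i[k]
    return w


# Toy, noise-free data generated from w* = [2, -1] (an assumption for the demo).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0], [2.0, 1.0]]
y = [2.0 * a - 1.0 * b for a, b in X]
w_hat = saga_least_squares(X, y, step=1.0 / 15.0, epochs=200)
```

The memory table `alpha` holds one stored gradient per data point, which is the storage cost the abstract refers to. The step size `1/15` corresponds to `1/(3L)` with `L` taken as the largest squared row norm of this toy `X`, a common choice for SAGA rather than anything prescribed by this record.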

Domains

Machine Learning [cs.LG]; Optimization and Control [math.OC]; Machine Learning [stat.ML]

Contributor: Simon Lacoste-Julien

https://inria.hal.science/hal-01248672

Submitted on: Monday, December 28, 2015, 04:30:01

Last modified on: Friday, April 19, 2024, 16:18:56

Dates and versions

hal-01248672, version 1 (28-12-2015)

Identifiers

HAL Id: hal-01248672, version 1
arXiv: 1506.03662

Cite

Thomas Hofmann, Aurelien Lucchi, Simon Lacoste-Julien, Brian Mcwilliams. Variance Reduced Stochastic Gradient Descent with Neighbors. NIPS 2015 - Advances in Neural Information Processing Systems 28, Dec 2015, Montreal, Canada. ⟨hal-01248672⟩

Collections

ENS-PARIS CNRS INRIA INRIA2 TDS-MACS PSL

250 views

0 downloads
