Conference papers

Constant Step Size Stochastic Gradient Descent for Probabilistic Modeling

Dmitry Babichev 1, 2 Francis Bach 2, 1
1 SIERRA - Statistical Machine Learning and Parsimony
DI-ENS - Département d'informatique de l'École normale supérieure, CNRS - Centre National de la Recherche Scientifique, Inria de Paris
Abstract : Stochastic gradient methods enable learning probabilistic models from large amounts of data. While large step-sizes (learning rates) have been shown to be best for least-squares (e.g., Gaussian noise) once combined with parameter averaging, they do not lead to convergent algorithms in general. In this paper, we consider generalized linear models, that is, conditional models based on exponential families. We propose averaging moment parameters instead of natural parameters for constant-step-size stochastic gradient descent. For finite-dimensional models, we show that this can sometimes (and surprisingly) lead to better predictions than the best linear model. For infinite-dimensional models, we show that it always converges to optimal predictions, while averaging natural parameters never does. We illustrate our findings with simulations on synthetic data and classical benchmarks with many observations.
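The contrast drawn in the abstract between averaging natural parameters and averaging moment parameters can be illustrated with a minimal sketch. This is not the authors' code: it assumes logistic regression as the generalized linear model, and the names `constant_step_sgd` and `make_data`, the step size, the synthetic data, and the test point are all hypothetical choices made for illustration. The natural-parameter prediction applies the sigmoid to the averaged iterate, while the moment-parameter prediction averages the sigmoid outputs of the individual iterates.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function (natural parameter -> mean parameter)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def constant_step_sgd(data, x_test, step=0.5):
    """One pass of constant-step-size SGD on the logistic loss.

    Returns two estimates of P(y = 1 | x_test):
      - averaged natural parameters: sigmoid(x_test . mean(theta_t))
      - averaged moment parameters:  mean(sigmoid(x_test . theta_t))
    """
    d = len(x_test)
    theta = [0.0] * d
    theta_sum = [0.0] * d
    moment_sum = 0.0
    for x, y in data:
        # Stochastic gradient of the logistic loss at the current iterate.
        residual = sigmoid(dot(theta, x)) - y
        theta = [t - step * residual * xi for t, xi in zip(theta, x)]
        # Accumulate both kinds of averages along the trajectory.
        theta_sum = [s + t for s, t in zip(theta_sum, theta)]
        moment_sum += sigmoid(dot(theta, x_test))
    n = len(data)
    theta_bar = [s / n for s in theta_sum]
    return sigmoid(dot(theta_bar, x_test)), moment_sum / n


def make_data(n, theta_star, rng):
    """Hypothetical synthetic data: Gaussian inputs, Bernoulli labels."""
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in theta_star]
        y = 1.0 if rng.random() < sigmoid(dot(theta_star, x)) else 0.0
        data.append((x, y))
    return data


rng = random.Random(0)
data = make_data(5000, [1.0, -2.0], rng)
p_natural, p_moment = constant_step_sgd(data, x_test=[0.5, 0.5])
print("averaged natural parameters:", p_natural)
print("averaged moment parameters: ", p_moment)
```

Because the sigmoid is nonlinear, the two estimates generally differ whenever the iterates keep fluctuating, which is the case with a constant step size. The sketch only shows where the two averages are taken; it makes no claim about which estimate is closer to the truth on this toy data.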

Cited literature [25 references]

https://hal.inria.fr/hal-01929810
Contributor : Dmitry Babichev
Submitted on : Wednesday, November 21, 2018 - 2:14:39 PM
Last modification on : Tuesday, May 4, 2021 - 2:06:02 PM
Long-term archiving on : Friday, February 22, 2019 - 2:12:46 PM

File

Averaging_predictions.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01929810, version 1
  • ARXIV : 1804.05567

Citation

Dmitry Babichev, Francis Bach. Constant Step Size Stochastic Gradient Descent for Probabilistic Modeling. UAI 2018 - Conference on Uncertainty in Artificial Intelligence, Aug 2018, Monterey, United States. ⟨hal-01929810⟩
