D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods, Prentice Hall, 1989.

R. Collobert, S. Bengio, and Y. Bengio, A Parallel Mixture of SVMs for Very Large Scale Problems, Neural Computation, vol. 14, no. 5, pp. 1105-1114, 2002.

C. De Sa, C. Zhang, K. Olukotun, and C. Ré, Taming the wild: a unified analysis of Hogwild!-style algorithms, NIPS, 2015.

A. Defazio, F. Bach, and S. Lacoste-Julien, SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives, NIPS, 2014.
URL: https://hal.archives-ouvertes.fr/hal-01016843

J. C. Duchi, S. Chaturapruek, and C. Ré, Asynchronous stochastic convex optimization, NIPS, 2015.

T. Hofmann, A. Lucchi, S. Lacoste-Julien, and B. McWilliams, Variance reduced stochastic gradient descent with neighbors, NIPS, 2015.
URL: https://hal.archives-ouvertes.fr/hal-01248672

C.-J. Hsieh, H.-F. Yu, and I. S. Dhillon, PASSCoDe: Parallel ASynchronous Stochastic dual Co-ordinate Descent, ICML, 2015.

R. Johnson and T. Zhang, Accelerating stochastic gradient descent using predictive variance reduction, NIPS, 2013.

J. Konečný and P. Richtárik, Semi-stochastic gradient descent methods, arXiv preprint, 2013.

N. Le Roux, M. Schmidt, and F. Bach, A stochastic gradient method with an exponential convergence rate for finite training sets, NIPS, 2012.
URL: https://hal.archives-ouvertes.fr/hal-00674995

D. D. Lewis, Y. Yang, T. G. Rose, and F. Li, RCV1: A new benchmark collection for text categorization research, JMLR, vol. 5, pp. 361-397, 2004.

X. Lian, Y. Huang, Y. Li, and J. Liu, Asynchronous parallel stochastic gradient for nonconvex optimization, NIPS, 2015.

J. Liu, S. J. Wright, C. Ré, V. Bittorf, and S. Sridhar, An asynchronous parallel stochastic coordinate descent algorithm, JMLR, vol. 16, pp. 285-322, 2015.

C. Ma, V. Smith, M. Jaggi, M. I. Jordan, P. Richtárik, and M. Takáč, Adding vs. averaging in distributed primal-dual optimization, ICML, 2015.

J. Ma, L. K. Saul, S. Savage, and G. M. Voelker, Identifying suspicious URLs: an application of large-scale online learning, ICML, 2009.
DOI: 10.1145/1553374.1553462

H. Mania, X. Pan, D. Papailiopoulos, B. Recht, K. Ramchandran, and M. I. Jordan, Perturbed Iterate Analysis for Asynchronous Stochastic Optimization, SIAM Journal on Optimization, vol. 27, no. 4, 2017.
DOI: 10.1137/16M1057000

F. Niu, B. Recht, C. Ré, and S. J. Wright, Hogwild!: a lock-free approach to parallelizing stochastic gradient descent, NIPS, 2011.

S. J. Reddi, A. Hefny, S. Sra, B. Póczos, and A. Smola, On variance reduction in stochastic gradient descent and its asynchronous variants, NIPS, 2015.

M. Schmidt, N. Le Roux, and F. Bach, Minimizing finite sums with the stochastic average gradient, Mathematical Programming, 2016.
DOI: 10.1007/s10107-016-1030-6
URL: https://hal.archives-ouvertes.fr/hal-00860051

S. Shalev-Shwartz and T. Zhang, Stochastic dual coordinate ascent methods for regularized loss minimization, JMLR, vol. 14, pp. 567-599, 2013.

S.-Y. Zhao and W.-J. Li, Fast asynchronous parallel stochastic gradient descent: a lock-free approach with convergence guarantee, AAAI, 2016.