A. Agarwal and L. Bottou, A lower bound for the optimization of finite sums, Proceedings of the International Conference on Machine Learning (ICML), 2015.

Z. Allen-Zhu, Katyusha: the first direct acceleration of stochastic gradient methods, Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing (STOC), 2017.

URL : http://arxiv.org/pdf/1603.05953

Y. Arjevani and O. Shamir, Dimension-free iteration complexity of finite sum optimization problems, Advances in Neural Information Processing Systems (NIPS), 2016.

A. Auslender, Numerical methods for nondifferentiable convex optimization, Nonlinear Analysis and Optimization, pp.102-126, 1987.
DOI : 10.1007/BFb0121157

F. Bach, R. Jenatton, J. Mairal, and G. Obozinski, Optimization with Sparsity-Inducing Penalties, Foundations and Trends in Machine Learning, vol.4, issue.1, pp.1-106, 2012.
DOI : 10.1561/2200000015

URL : https://hal.archives-ouvertes.fr/hal-00613125

A. Beck and M. Teboulle, A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems, SIAM Journal on Imaging Sciences, vol.2, issue.1, pp.183-202, 2009.
DOI : 10.1137/080716542

URL : http://ie.technion.ac.il/%7Ebecka/papers/finalicassp2009.pdf

D. P. Bertsekas, Convex Optimization Algorithms, Athena Scientific, 2015.

A. Chambolle and T. Pock, A remark on accelerated block coordinate descent for computing the proximity operators of a sum of convex functions, SMAI Journal of Computational Mathematics, vol.1, pp.29-54, 2015.
DOI : 10.5802/smai-jcm.3

URL : https://hal.archives-ouvertes.fr/hal-01099182

R. Correa and C. Lemaréchal, Convergence of some algorithms for convex minimization, Mathematical Programming, pp.261-275, 1993.
DOI : 10.1007/BF01585170

A. Defazio, A simple practical accelerated method for finite sums, Advances in Neural Information Processing Systems (NIPS), 2016.

A. Defazio, F. Bach, and S. Lacoste-Julien, SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives, Advances in Neural Information Processing Systems (NIPS), 2014.
URL : https://hal.archives-ouvertes.fr/hal-01016843

A. Defazio, J. Domke, and T. S. Caetano, Finito: A faster, permutable incremental gradient method for big data problems, Proceedings of the International Conference on Machine Learning (ICML), 2014.

O. Devolder, F. Glineur, and Y. Nesterov, First-order methods of smooth convex optimization with inexact oracle, Mathematical Programming, vol.146, issue.1-2, pp.37-75, 2014.

URL : http://www.ecore.be/DPs/dp_1297333979.pdf

R. Frostig, R. Ge, S. M. Kakade, and A. Sidford, Un-regularizing: approximate proximal point and faster stochastic algorithms for empirical risk minimization, Proceedings of the International Conference on Machine Learning (ICML), 2015.

M. Fuentes, J. Malick, and C. Lemaréchal, Descentwise inexact proximal algorithms for smooth optimization, Computational Optimization and Applications, pp.755-769, 2012.

URL : https://hal.archives-ouvertes.fr/hal-00628777

P. Giselsson and M. Fält, Nonsmooth minimization using smooth envelope functions, 2016.

O. Güler, On the Convergence of the Proximal Point Algorithm for Convex Minimization, SIAM Journal on Control and Optimization, vol.29, issue.2, pp.403-419, 1991.
DOI : 10.1137/0329022

O. Güler, New Proximal Point Algorithms for Convex Minimization, SIAM Journal on Optimization, vol.2, issue.4, pp.649-664, 1992.
DOI : 10.1137/0802032

B. He and X. Yuan, An Accelerated Inexact Proximal Point Algorithm for Convex Minimization, Journal of Optimization Theory and Applications, vol.154, issue.2, pp.536-548, 2012.

URL : http://www.optimization-online.org/DB_FILE/2010/09/2741.pdf

R. Johnson and T. Zhang, Accelerating stochastic gradient descent using predictive variance reduction, Advances in Neural Information Processing Systems (NIPS), 2013.

G. Lan and Y. Zhou, An optimal randomized incremental gradient method, Mathematical Programming, 2017.

URL : http://arxiv.org/pdf/1507.02000

C. Lemaréchal and C. Sagastizábal, Practical Aspects of the Moreau-Yosida Regularization: Theoretical Preliminaries, SIAM Journal on Optimization, vol.7, issue.2, pp.367-385, 1997.
DOI : 10.1137/S1052623494267127

H. Lin, J. Mairal, and Z. Harchaoui, A universal catalyst for first-order optimization, Advances in Neural Information Processing Systems (NIPS), 2015.
URL : https://hal.archives-ouvertes.fr/hal-01160728

Q. Lin, Z. Lu, and L. Xiao, An Accelerated Randomized Proximal Coordinate Gradient Method and its Application to Regularized Empirical Risk Minimization, SIAM Journal on Optimization, vol.25, issue.4, pp.2244-2273, 2015.
DOI : 10.1137/141000270

J. Mairal, Incremental Majorization-Minimization Optimization with Application to Large-Scale Machine Learning, SIAM Journal on Optimization, vol.25, issue.2, pp.829-855, 2015.
DOI : 10.1137/140957639

J. Mairal, End-to-end kernel learning with supervised convolutional kernel networks, Advances in Neural Information Processing Systems (NIPS), 2016.

B. Martinet, Régularisation d'inéquations variationnelles par approximations successives, Revue Française d'Informatique et de Recherche Opérationnelle, Série Rouge, vol.4, issue.R3, pp.154-158, 1970.
DOI : 10.1051/m2an/197004r301541

URL : https://www.esaim-m2an.org/articles/m2an/pdf/1970/03/m2an197004R301541.pdf

J. Moreau, Fonctions convexes duales et points proximaux dans un espace hilbertien, C. R. Acad. Sci. Paris Sér. A Math., vol.255, pp.2897-2899, 1962.
URL : https://hal.archives-ouvertes.fr/hal-01867195

A. Nemirovskii and D. B. Yudin, Problem complexity and method efficiency in optimization, Wiley, 1983.

Y. Nesterov, A method of solving a convex programming problem with convergence rate O(1/k^2), Soviet Mathematics Doklady, vol.27, issue.2, pp.372-376, 1983.

Y. Nesterov, Introductory Lectures on Convex Optimization: A Basic Course, 2004.
DOI : 10.1007/978-1-4419-8853-9

Y. Nesterov, Efficiency of Coordinate Descent Methods on Huge-Scale Optimization Problems, SIAM Journal on Optimization, vol.22, issue.2, pp.341-362, 2012.
DOI : 10.1137/100802001

Y. Nesterov, Gradient methods for minimizing composite functions, Mathematical Programming, vol.140, issue.1, pp.125-161, 2013.

C. Paquette, H. Lin, D. Drusvyatskiy, J. Mairal, and Z. Harchaoui, Catalyst acceleration for gradient-based non-convex optimization, Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2018.
URL : https://hal.archives-ouvertes.fr/hal-01536017

N. Parikh and S. P. Boyd, Proximal Algorithms, Foundations and Trends in Optimization, vol.1, issue.3, pp.123-231, 2014.
DOI : 10.1561/2400000003

P. Richtárik and M. Takáč, Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function, Mathematical Programming, vol.144, issue.1-2, pp.1-38, 2014.

R. T. Rockafellar, Monotone Operators and the Proximal Point Algorithm, SIAM Journal on Control and Optimization, vol.14, issue.5, pp.877-898, 1976.
DOI : 10.1137/0314056

URL : http://www.math.washington.edu/~rtr/papers/rtr-MonoOpProxPoint.pdf

S. Salzo and S. Villa, Inexact and accelerated proximal point algorithms, Journal of Convex Analysis, vol.19, issue.4, pp.1167-1192, 2012.

M. Schmidt, N. L. Roux, and F. Bach, Convergence rates of inexact proximal-gradient methods for convex optimization, Advances in Neural Information Processing Systems (NIPS), 2011.
URL : https://hal.archives-ouvertes.fr/inria-00618152

M. Schmidt, N. L. Roux, and F. Bach, Minimizing finite sums with the stochastic average gradient, Mathematical Programming, vol.162, issue.1-2, pp.83-112, 2017.
DOI : 10.1007/s10107-016-1030-6

URL : https://hal.archives-ouvertes.fr/hal-00860051

D. Scieur, A. d'Aspremont, and F. Bach, Regularized nonlinear acceleration, Advances in Neural Information Processing Systems (NIPS), 2016.
URL : https://hal.archives-ouvertes.fr/hal-01384682

S. Shalev-Shwartz and T. Zhang, Proximal stochastic dual coordinate ascent, arXiv preprint arXiv:1211, 2012.

S. Shalev-Shwartz and T. Zhang, Accelerated proximal stochastic dual coordinate ascent for regularized loss minimization, Mathematical Programming, vol.155, issue.1-2, pp.105-145, 2016.
DOI : 10.1007/s10107-014-0839-0

URL : http://arxiv.org/pdf/1309.2375

A. Sidi, Vector Extrapolation Methods with Applications, Society for Industrial and Applied Mathematics, 2017.
DOI : 10.1137/1.9781611974966

M. V. Solodov and B. F. Svaiter, A unified framework for some inexact proximal point algorithms, Numerical Functional Analysis and Optimization, pp.1013-1035, 2001.
DOI : 10.1081/nfa-100108320

URL : http://www.cs.wisc.edu/~solodov/solsva01unif.ps

A. Themelis, L. Stella, and P. Patrinos, Forward-backward envelope for the sum of two nonconvex functions: Further properties and nonmonotone line-search algorithms, 2016.

B. E. Woodworth and N. Srebro, Tight complexity bounds for optimizing composite objectives, Advances in Neural Information Processing Systems (NIPS), 2016.