A. Beck and M. Teboulle, A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems, SIAM Journal on Imaging Sciences, vol.2, issue.1, pp.183-202, 2009.
DOI : 10.1137/080716542

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.231.3271

A. Beck and M. Teboulle, Smoothing and First Order Methods: A Unified Framework, SIAM Journal on Optimization, vol.22, issue.2, pp.557-580, 2012.
DOI : 10.1137/100818327

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.307.7067

D. P. Bertsekas, Convex Optimization Algorithms, Athena Scientific, 2015.

J. Bonnans, J. C. Gilbert, C. Lemaréchal, and C. A. Sagastizábal, Numerical Optimization: Theoretical and Practical Aspects, 2006.
DOI : 10.1007/978-3-662-05078-1

J. Burke and M. Qian, On the superlinear convergence of the variable metric proximal point algorithm using Broyden and BFGS matrix secant updating, Mathematical Programming, pp.157-181, 2000.
DOI : 10.1007/PL00011373

R. Byrd, S. Hansen, J. Nocedal, and Y. Singer, A Stochastic Quasi-Newton Method for Large-Scale Optimization, SIAM Journal on Optimization, vol.26, issue.2, pp.1008-1031, 2016.
DOI : 10.1137/140954362

URL : http://arxiv.org/abs/1401.7020

R. H. Byrd, J. Nocedal, and F. Oztoprak, An inexact successive quadratic approximation method for L-1 regularized optimization, Mathematical Programming, pp.375-396, 2015.
DOI : 10.1007/s10107-015-0941-y

R. H. Byrd, J. Nocedal, and Y. Yuan, Global Convergence of a Class of Quasi-Newton Methods on Convex Problems, SIAM Journal on Numerical Analysis, vol.24, issue.5, pp.1171-1190, 1987.
DOI : 10.1137/0724077

X. Chen and M. Fukushima, Proximal quasi-Newton methods for nondifferentiable convex optimization, Mathematical Programming, vol.85, issue.2, pp.313-334, 1999.
DOI : 10.1007/s101070050059

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.49.4402

A. Defazio, F. Bach, and S. Lacoste-Julien, SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives, Advances in Neural Information Processing Systems (NIPS), 2014.
URL : https://hal.archives-ouvertes.fr/hal-01016843

A. Defazio, J. Domke, and T. S. Caetano, Finito: A faster, permutable incremental gradient method for big data problems, Proceedings of the International Conference on Machine Learning (ICML), 2014.

O. Devolder, F. Glineur, and Y. Nesterov, First-order methods of smooth convex optimization with inexact oracle, Mathematical Programming, vol.146, issue.1-2, pp.37-75, 2014.
DOI : 10.1007/s10107-013-0677-5

J. C. Duchi, P. L. Bartlett, and M. J. Wainwright, Randomized Smoothing for Stochastic Optimization, SIAM Journal on Optimization, vol.22, issue.2, pp.674-701, 2012.
DOI : 10.1137/110831659

URL : http://arxiv.org/abs/1103.4296

M. Elad, Sparse and Redundant Representations, 2010.
DOI : 10.1007/978-1-4419-7011-4

URL : https://hal.archives-ouvertes.fr/inria-00568893

M. P. Friedlander and M. Schmidt, Hybrid Deterministic-Stochastic Methods for Data Fitting, SIAM Journal on Scientific Computing, vol.34, issue.3, pp.1380-1405, 2012.
DOI : 10.1137/110830629

URL : https://hal.archives-ouvertes.fr/inria-00626571

R. Frostig, R. Ge, S. M. Kakade, and A. Sidford, Un-regularizing: approximate proximal point and faster stochastic algorithms for empirical risk minimization, Proceedings of the International Conference on Machine Learning (ICML), 2015.

M. Fuentes, J. Malick, and C. Lemaréchal, Descentwise inexact proximal algorithms for smooth optimization, Computational Optimization and Applications, vol.53, pp.755-769, 2012.
DOI : 10.1007/s10589-012-9461-3

URL : https://hal.archives-ouvertes.fr/hal-00628777

M. Fukushima and L. Qi, A Globally and Superlinearly Convergent Algorithm for Nonsmooth Convex Minimization, SIAM Journal on Optimization, vol.6, issue.4, pp.1106-1120, 1996.
DOI : 10.1137/S1052623494278839

S. Ghadimi, G. Lan, and H. Zhang, Generalized Uniformly Optimal Methods for Nonlinear Programming, 2015.

R. M. Gower, D. Goldfarb, and P. Richtárik, Stochastic block BFGS: Squeezing more curvature out of data, Proceedings of the International Conference on Machine Learning (ICML), 2016.

O. Güler, New Proximal Point Algorithms for Convex Minimization, SIAM Journal on Optimization, vol.2, issue.4, pp.649-664, 1992.
DOI : 10.1137/0802032

J. Hiriart-Urruty and C. Lemaréchal, Convex Analysis and Minimization Algorithms I, 1996.
DOI : 10.1007/978-3-662-02796-7

J. Hiriart-Urruty and C. Lemaréchal, Convex Analysis and Minimization Algorithms II, 1996.
DOI : 10.1007/978-3-662-06409-2

J. Lee, Y. Sun, and M. Saunders, Proximal Newton-type methods for convex optimization, Advances in Neural Information Processing Systems (NIPS), 2012.

C. Lemaréchal and C. Sagastizábal, Practical Aspects of the Moreau-Yosida Regularization: Theoretical Preliminaries, SIAM Journal on Optimization, vol.7, issue.2, pp.367-385, 1997.
DOI : 10.1137/S1052623494267127

H. Lin, J. Mairal, and Z. Harchaoui, A universal catalyst for first-order optimization, Advances in Neural Information Processing Systems (NIPS), 2015.
URL : https://hal.archives-ouvertes.fr/hal-01160728

D. C. Liu and J. Nocedal, On the limited memory BFGS method for large scale optimization, Mathematical Programming, vol.45, issue.1-3, pp.503-528, 1989.
DOI : 10.1007/BF01589116

J. Mairal, F. Bach, and J. Ponce, Sparse modeling for image and vision processing. Foundations and Trends in Computer Graphics and Vision, pp.85-283, 2014.
DOI : 10.1561/0600000058

URL : https://hal.archives-ouvertes.fr/hal-01081139

R. Mifflin, A quasi-second-order proximal bundle algorithm, Mathematical Programming, pp.51-72, 1996.
DOI : 10.1007/BF02592098

P. Moritz, R. Nishihara, and M. I. Jordan, A linearly-convergent stochastic L-BFGS algorithm, Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2016.

Y. Nesterov, A method of solving a convex programming problem with convergence rate O(1/k^2), Soviet Mathematics Doklady, vol.27, issue.2, pp.372-376, 1983.

Y. Nesterov, Introductory Lectures on Convex Optimization: A Basic Course, 2004.
DOI : 10.1007/978-1-4419-8853-9

Y. Nesterov, Smooth minimization of non-smooth functions, Mathematical Programming, vol.103, issue.1, pp.127-152, 2005.
DOI : 10.1007/s10107-004-0552-5

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.322.5275

Y. Nesterov, Efficiency of Coordinate Descent Methods on Huge-Scale Optimization Problems, SIAM Journal on Optimization, vol.22, issue.2, pp.341-362, 2012.
DOI : 10.1137/100802001

Y. Nesterov, Gradient methods for minimizing composite functions, Mathematical Programming, pp.125-161, 2013.
DOI : 10.1007/s10107-012-0629-5

J. Nocedal and S. Wright, Numerical Optimization, 2006.
DOI : 10.1007/b98874

M. Razaviyayn, M. Hong, and Z. Luo, A Unified Convergence Analysis of Block Successive Minimization Methods for Nonsmooth Optimization, SIAM Journal on Optimization, vol.23, issue.2, pp.1126-1153, 2013.
DOI : 10.1137/120891009

P. Richtárik and M. Takáč, Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function, Mathematical Programming, pp.1-38, 2014.
DOI : 10.1007/s10107-012-0614-z

R. T. Rockafellar, Monotone Operators and the Proximal Point Algorithm, SIAM Journal on Control and Optimization, vol.14, issue.5, pp.877-898, 1976.
DOI : 10.1137/0314056

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.298.5154

S. Salzo and S. Villa, Inexact and accelerated proximal point algorithms, Journal of Convex Analysis, vol.19, issue.4, pp.1167-1192, 2012.

K. Scheinberg and X. Tang, Practical inexact proximal quasi-Newton method with global complexity analysis, Mathematical Programming, pp.495-529, 2016.
DOI : 10.1007/s10107-016-0997-3

URL : http://arxiv.org/abs/1311.6547

M. Schmidt, D. Kim, and S. Sra, Projected Newton-type methods in machine learning, in Optimization for Machine Learning, MIT Press, pp.305-330, 2011.

M. Schmidt, N. Le Roux, and F. Bach, Minimizing finite sums with the stochastic average gradient, Mathematical Programming, vol.162, issue.1-2, pp.83-112, 2017.
URL : https://hal.archives-ouvertes.fr/hal-00860051

S. Shalev-Shwartz and T. Zhang, Accelerated proximal stochastic dual coordinate ascent for regularized loss minimization, Mathematical Programming, pp.105-145, 2016.
DOI : 10.1007/s10107-014-0839-0

URL : http://arxiv.org/abs/1309.2375

L. Xiao and T. Zhang, A Proximal Stochastic Gradient Method with Progressive Variance Reduction, SIAM Journal on Optimization, vol.24, issue.4, pp.2057-2075, 2014.
DOI : 10.1137/140961791

URL : http://arxiv.org/abs/1403.4699

J. Yu, S. Vishwanathan, S. Günter, and N. N. Schraudolph, A quasi-Newton approach to non-smooth convex optimization, Proceedings of the International Conference on Machine Learning (ICML), 2008.
DOI : 10.1145/1390156.1390309

URL : http://arxiv.org/pdf/0804.3835v1.pdf

H. Zou and T. Hastie, Regularization and variable selection via the elastic net, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.67, issue.2, pp.301-320, 2005.
DOI : 10.1111/j.1467-9868.2005.00503.x