M. Aharon and M. Elad, Sparse and Redundant Modeling of Image Content Using an Image-Signature-Dictionary, SIAM Journal on Imaging Sciences, vol.1, issue.3, pp.228-247, 2008.
DOI : 10.1137/07070156X

M. Aharon, M. Elad, and A. M. Bruckstein, K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation, IEEE Transactions on Signal Processing, vol.54, issue.11, pp.4311-4322, 2006.
DOI : 10.1109/TSP.2006.881199

F. Bach, Consistency of the group Lasso and multiple kernel learning, Journal of Machine Learning Research, vol.9, pp.1179-1224, 2008.
URL : https://hal.archives-ouvertes.fr/hal-00164735

F. Bach, J. Mairal, and J. Ponce, Convex sparse matrix factorizations, 2008.
URL : https://hal.archives-ouvertes.fr/hal-00345747

D. P. Bertsekas, Nonlinear Programming, Athena Scientific, Belmont, 1999.

P. Bickel, Y. Ritov, and A. Tsybakov, Simultaneous analysis of Lasso and Dantzig selector, The Annals of Statistics, vol.37, issue.4, pp.1705-1732, 2009.
DOI : 10.1214/08-AOS620
URL : https://hal.archives-ouvertes.fr/hal-00401585

J. F. Bonnans and A. Shapiro, Optimization Problems with Perturbations: A Guided Tour, SIAM Review, vol.40, issue.2, pp.202-227, 1998.
DOI : 10.1137/S0036144596302644
URL : https://hal.archives-ouvertes.fr/inria-00073819

J. F. Bonnans and A. Shapiro, Perturbation Analysis of Optimization Problems, 2000.
DOI : 10.1007/978-1-4612-1394-9

J. M. Borwein and A. S. Lewis, Convex Analysis and Nonlinear Optimization: Theory and Examples, 2006.

L. Bottou, Online algorithms and stochastic approximations, Online Learning and Neural Networks, 1998.

L. Bottou and O. Bousquet, The trade-offs of large scale learning, Advances in Neural Information Processing Systems, pp.161-168, 2008.

D. M. Bradley and J. A. Bagnell, Differentiable sparse coding, Advances in Neural Information Processing Systems, pp.113-120, 2009.

S. S. Chen, D. L. Donoho, and M. A. Saunders, Atomic Decomposition by Basis Pursuit, SIAM Journal on Scientific Computing, vol.20, issue.1, pp.33-61, 1999.
DOI : 10.1137/S1064827596304010

K. Chin, S. Devries, J. Fridlyand, P. T. Spellman, R. Roydasgupta et al., Genomic and transcriptional aberrations linked to breast cancer pathophysiologies, Cancer Cell, vol.10, issue.6, pp.529-541, 2006.
DOI : 10.1016/j.ccr.2006.10.009
URL : http://doi.org/10.1016/j.ccr.2006.10.009

S. F. Cotter, B. D. Rao, K. Engan, and K. Kreutz-Delgado, Sparse solutions to linear inverse problems with multiple measurement vectors, IEEE Transactions on Signal Processing, vol.53, issue.7, pp.2477-2488, 2005.
DOI : 10.1109/TSP.2005.849172

J. M. Danskin, The theory of max-min, and its application to weapons allocation problems, Ökonometrie und Unternehmensforschung, 1967.

A. d'Aspremont, L. El Ghaoui, M. I. Jordan, and G. R. Lanckriet, A Direct Formulation for Sparse PCA Using Semidefinite Programming, SIAM Review, vol.49, issue.3, pp.434-448, 2007.
DOI : 10.1137/050645506

A. d'Aspremont, F. Bach, and L. El Ghaoui, Optimal solutions for sparse principal component analysis, Journal of Machine Learning Research, vol.9, pp.1269-1294, 2008.

J. Duchi, S. Shalev-Shwartz, Y. Singer, and T. Chandra, Efficient projections onto the ℓ1-ball for learning in high dimensions, Proceedings of the International Conference on Machine Learning (ICML), 2008.

B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, Least angle regression, Annals of Statistics, vol.32, issue.2, pp.407-499, 2004.

M. Elad and M. Aharon, Image Denoising Via Sparse and Redundant Representations Over Learned Dictionaries, IEEE Transactions on Image Processing, vol.15, issue.12, pp.3736-3745, 2006.
DOI : 10.1109/TIP.2006.881969
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.109.6477

K. Engan, S. O. Aase, and J. H. Husoy, Frame based signal compression using method of optimal directions (MOD), Proceedings of the 1999 IEEE International Symposium on Circuits and Systems (ISCAS'99), 1999.
DOI : 10.1109/ISCAS.1999.779928

C. Févotte, N. Bertin, and J. L. Durrieu, Nonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis, Neural Computation, vol.21, issue.3, pp.793-830, 2009.

D. L. Fisk, Quasi-martingales, Transactions of the American Mathematical Society, pp.359-388, 1965.
DOI : 10.1090/s0002-9947-1965-0192542-5

J. Friedman, T. Hastie, H. Höfling, and R. Tibshirani, Pathwise coordinate optimization, The Annals of Applied Statistics, vol.1, issue.2, pp.302-332, 2007.
DOI : 10.1214/07-AOAS131

W. J. Fu, Penalized regressions: The bridge versus the Lasso, Journal of Computational and Graphical Statistics, vol.7, pp.397-416, 1998.

J. J. Fuchs, Recovery of Exact Sparse Representations in the Presence of Bounded Noise, IEEE Transactions on Information Theory, vol.51, issue.10, pp.3601-3608, 2005.
DOI : 10.1109/TIT.2005.855614

A. S. Georghiades, P. N. Belhumeur, and D. J. Kriegman, From few to many: illumination cone models for face recognition under variable lighting and pose, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.23, issue.6, pp.643-660, 2001.
DOI : 10.1109/34.927464

G. H. Golub and C. F. Van Loan, Matrix Computations, 1996.

R. Grosse, R. Raina, H. Kwong, and A. Y. Ng, Shift-invariant sparse coding for audio classification, Proceedings of the Twenty-third Conference on Uncertainty in Artificial Intelligence (UAI), 2007.

Z. Harchaoui, Méthodes à Noyaux pour la Détection, 2008.

Z. Harchaoui and C. Lévy-Leduc, Catching change-points with Lasso, Advances in Neural Information Processing Systems, pp.161-168, 2008.

H. Hotelling, Relations Between Two Sets of Variates, Biometrika, vol.28, issue.3-4, pp.321-377, 1936.
DOI : 10.1093/biomet/28.3-4.321

P. O. Hoyer, Non-negative sparse coding, Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing, 2002.
DOI : 10.1109/NNSP.2002.1030067
URL : http://arxiv.org/abs/cs/0202009

P. O. Hoyer, Non-negative matrix factorization with sparseness constraints, Journal of Machine Learning Research, vol.5, pp.1457-1469, 2004.

L. Jacob, G. Obozinski, and J. Vert, Group lasso with overlap and graph lasso, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, 2009.
DOI : 10.1145/1553374.1553431

R. Jenatton, J. Audibert, and F. Bach, Structured variable selection with sparsity-inducing norms, 2009.
URL : https://hal.archives-ouvertes.fr/inria-00377732

R. Jenatton, G. Obozinski, and F. Bach, Structured sparse principal component analysis, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00414158

I. T. Jolliffe, N. T. Trendafilov, and M. Uddin, A Modified Principal Component Technique Based on the LASSO, Journal of Computational and Graphical Statistics, vol.12, issue.3, pp.531-547, 2003.
DOI : 10.1198/1061860032148

K. Kavukcuoglu, M. Ranzato, and Y. LeCun, Fast inference in sparse coding algorithms with applications to object recognition, 2008.

Y. Koren, R. Bell, and C. Volinsky, Matrix Factorization Techniques for Recommender Systems, Computer, vol.42, issue.8, pp.30-37, 2009.
DOI : 10.1109/MC.2009.263

H. J. Kushner and G. Yin, Stochastic Approximation and Recursive Algorithms and Applications, 2003.

D. D. Lee and H. S. Seung, Algorithms for non-negative matrix factorization, Advances in Neural Information Processing Systems, pp.556-562, 2001.

H. Lee, A. Battle, R. Raina, and A. Y. Ng, Efficient sparse coding algorithms, Advances in Neural Information Processing Systems, pp.801-808, 2007.

K. C. Lee, J. Ho, and D. Kriegman, Acquiring linear subspaces for face recognition under variable lighting, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.27, issue.5, pp.684-698, 2005.

M. S. Lewicki and T. J. Sejnowski, Learning Overcomplete Representations, Neural Computation, vol.12, issue.2, pp.337-365, 2000.

C. J. Lin, Projected Gradient Methods for Nonnegative Matrix Factorization, Neural Computation, vol.19, issue.10, pp.2756-2779, 2007.

N. Maculan and J. R. Galdino de Paula, A linear-time median-finding algorithm for projecting a vector on the simplex of R^n, Operations Research Letters, vol.8, issue.4, pp.219-222, 1989.
DOI : 10.1016/0167-6377(89)90064-3

J. R. Magnus and H. Neudecker, Matrix Differential Calculus with Applications in Statistics and Econometrics, revised edition, 1999.

J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, Discriminative learned dictionaries for local image analysis, 2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008.
DOI : 10.1109/CVPR.2008.4587652

J. Mairal, M. Elad, and G. Sapiro, Sparse Representation for Color Image Restoration, IEEE Transactions on Image Processing, vol.17, issue.1, pp.53-69, 2008.
DOI : 10.1109/TIP.2007.911828
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.102.6724

J. Mairal, G. Sapiro, and M. Elad, Learning Multiscale Sparse Representations for Image and Video Restoration, Multiscale Modeling & Simulation, vol.7, issue.1, pp.214-241, 2008.
DOI : 10.1137/070697653
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.105.6238

J. Mairal, F. Bach, J. Ponce, and G. Sapiro, Online dictionary learning for sparse coding, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, 2009.
DOI : 10.1145/1553374.1553463
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.184.5417

J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, Supervised dictionary learning, Advances in Neural Information Processing Systems, pp.1033-1040, 2009.
URL : https://hal.archives-ouvertes.fr/inria-00322431

J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, Non-local sparse models for image restoration, 2009 IEEE 12th International Conference on Computer Vision, 2009.
DOI : 10.1109/ICCV.2009.5459452

S. Mallat, A Wavelet Tour of Signal Processing, Second Edition, 1999.

M. Métivier, Semi-martingales, 1983.

R. M. Neal and G. E. Hinton, A view of the EM algorithm that justifies incremental, sparse, and other variants, Learning in Graphical Models, pp.355-368, 1998.

Y. Nesterov, Gradient methods for minimizing composite objective function, Center for Operations Research and Econometrics (CORE), 2007.

G. Obozinski, M. J. Wainwright, and M. I. Jordan, Union support recovery in high-dimensional multivariate regression, UC Berkeley Technical Report, vol.761, 2008.

G. Obozinski, B. Taskar, and M. I. Jordan, Joint covariate selection and joint subspace selection for multiple classification problems, Statistics and Computing, vol.8, issue.68, 2009.
DOI : 10.1007/s11222-008-9111-x

B. A. Olshausen and D. J. Field, Sparse coding with an overcomplete basis set: A strategy employed by V1?, Vision Research, vol.37, issue.23, pp.3311-3325, 1997.
DOI : 10.1016/S0042-6989(97)00169-7

M. R. Osborne, B. Presnell, and B. A. Turlach, A new approach to variable selection in least squares problems, IMA Journal of Numerical Analysis, vol.20, issue.3, pp.389-403, 2000.
DOI : 10.1093/imanum/20.3.389

G. Peyré, Sparse Modeling of Textures, Journal of Mathematical Imaging and Vision, vol.27, issue.2, pp.17-31, 2009.
DOI : 10.1007/s10851-008-0120-3

M. Protter and M. Elad, Image Sequence Denoising via Sparse and Redundant Representations, IEEE Transactions on Image Processing, vol.18, issue.1, pp.27-36, 2009.
DOI : 10.1109/TIP.2008.2008065
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.159.8253

R. Raina, A. Battle, H. Lee, B. Packer, and A. Y. Ng, Self-taught learning, Proceedings of the 24th international conference on Machine learning, ICML '07, 2007.
DOI : 10.1145/1273496.1273592

V. Roth and B. Fischer, The Group-Lasso for generalized linear models, Proceedings of the 25th international conference on Machine learning, ICML '08, 2008.
DOI : 10.1145/1390156.1390263

S. Shalev-Shwartz, O. Shamir, N. Srebro, and K. Sridharan, Stochastic convex optimization, 22nd Annual Conference on Learning Theory (COLT), 2009.

K. Sung, Learning and Example Selection for Object and Pattern Recognition, 1996.

R. Tibshirani, Regression shrinkage and selection via the Lasso, Journal of the Royal Statistical Society. Series B, vol.58, issue.1, pp.267-288, 1996.

R. Tibshirani and P. Wang, Spatial smoothing and hot spot detection for CGH data using the fused lasso, Biostatistics, vol.9, issue.1, pp.18-29, 2008.
DOI : 10.1093/biostatistics/kxm013

R. Tibshirani, M. Saunders, S. Rosset, J. Zhu, and K. Knight, Sparsity and smoothness via the fused lasso, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.67, issue.1, pp.91-108, 2005.

J. A. Tropp, Algorithms for simultaneous sparse approximation. Part II: Convex relaxation, Signal Processing, Special Issue "Sparse Approximations in Signal and Image Processing", pp.589-602, 2006.
DOI : 10.1016/j.sigpro.2005.05.030

J. A. Tropp, A. C. Gilbert, and M. J. Strauss, Algorithms for simultaneous sparse approximation. Part I: Greedy pursuit, Signal Processing, Special Issue "Sparse Approximations in Signal and Image Processing", pp.572-588, 2006.
DOI : 10.1016/j.sigpro.2005.05.030

B. A. Turlach, W. N. Venables, and S. J. Wright, Simultaneous Variable Selection, Technometrics, vol.47, issue.3, pp.349-363, 2005.
DOI : 10.1198/004017005000000139

A. W. van der Vaart, Asymptotic Statistics, 1998.

D. M. Witten, R. Tibshirani, and T. Hastie, A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis, Biostatistics, vol.10, issue.3, pp.515-534, 2009.
DOI : 10.1093/biostatistics/kxp008

T. T. Wu and K. Lange, Coordinate descent algorithms for lasso penalized regression, The Annals of Applied Statistics, vol.2, issue.1, pp.224-244, 2008.
DOI : 10.1214/07-AOAS147SUPP

J. Yang, K. Yu, Y. Gong, and T. Huang, Linear spatial pyramid matching using sparse coding for image classification, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2009.

M. Yuan and Y. Lin, Model selection and estimation in regression with grouped variables, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.68, issue.1, pp.49-67, 2006.

R. Zass and A. Shashua, Nonnegative sparse PCA, Advances in Neural Information Processing Systems, pp.1561-1568, 2007.

H. H. Zhang, Y. Liu, Y. Wu, and J. Zhu, Variable selection for the multicategory SVM via adaptive sup-norm regularization, Electronic Journal of Statistics, vol.2, issue.0, pp.149-167, 2008.
DOI : 10.1214/08-EJS122

M. Zibulevsky and B. A. Pearlmutter, Blind Source Separation by Sparse Decomposition in a Signal Dictionary, Neural Computation, vol.13, issue.4, pp.863-882, 2001.

H. Zou and T. Hastie, Regularization and variable selection via the elastic net, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.67, issue.2, pp.301-320, 2005.

H. Zou, T. Hastie, and R. Tibshirani, Sparse Principal Component Analysis, Journal of Computational and Graphical Statistics, vol.15, issue.2, pp.265-286, 2006.
DOI : 10.1198/106186006X113430