H. Akaike, Information theory and an extension of the maximum likelihood principle, Second International Symposium on Information Theory (Tsahkadsor, 1971), pp.267-281, 1973.

P. Alquier and K. Lounici, PAC-Bayesian bounds for sparse regression estimation with exponential weights, Electronic Journal of Statistics, vol.5, pp.127-145, 2011.
DOI : 10.1214/11-EJS601
URL : https://hal.archives-ouvertes.fr/hal-00465801

Y. Amit and D. Geman, Shape Quantization and Recognition with Randomized Trees, Neural Computation, vol.9, issue.7, pp.1545-1588, 1997.
DOI : 10.1162/neco.1997.9.7.1545
URL : http://www.wisdom.weizmann.ac.il/~vision/courses/2003_2/shape.pdf

P. C. Bellec, Optimal bounds for aggregation of affine estimators, arXiv e-prints, 2014.

A. Belloni, V. Chernozhukov, and L. Wang, Square-root lasso: pivotal recovery of sparse signals via conic programming, Biometrika, vol.98, issue.4, pp.791-806, 2011.
DOI : 10.1093/biomet/asr043

G. Biau, Analysis of a random forests model, J. Mach. Learn. Res, vol.13, pp.1063-1095, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00476545

G. Biau and L. Devroye, On the layered nearest neighbour estimate, the bagged nearest neighbour estimate and the random forest method in regression and classification, Journal of Multivariate Analysis, vol.101, issue.10, pp.2499-2518, 2010.
DOI : 10.1016/j.jmva.2010.06.019
URL : https://hal.archives-ouvertes.fr/hal-00559811

G. Biau, L. Devroye, and G. Lugosi, Consistency of random forests and other averaging classifiers, J. Mach. Learn. Res, vol.9, pp.2015-2033, 2008.
URL : https://hal.archives-ouvertes.fr/hal-00355368

L. Breiman, Bagging predictors, Machine Learning, vol.24, issue.2, pp.123-140, 1996.
DOI : 10.1023/A:1018054314350

L. Breiman, Random forests, Machine Learning, vol.45, issue.1, pp.5-32, 2001.
DOI : 10.1023/A:1010933404324

O. Catoni, Statistical learning theory and stochastic optimization, Lecture notes from the 31st Summer School on Probability Theory held in Saint-Flour, 2001.
DOI : 10.1007/b99352
URL : https://hal.archives-ouvertes.fr/hal-00104952

O. Catoni, PAC-Bayesian supervised classification: the thermodynamics of statistical learning, Institute of Mathematical Statistics Lecture Notes-Monograph Series, vol.56, 2007.

N. Cesa-Bianchi and G. Lugosi, On Prediction of Individual Sequences, The Annals of Statistics, vol.27, issue.6, pp.1865-1895, 1999.
DOI : 10.2139/ssrn.139692

N. Cesa-Bianchi, Y. Freund, D. Haussler, D. P. Helmbold, R. E. Schapire et al., How to use expert advice, Journal of the ACM, vol.44, issue.3, pp.427-485, 1997.
DOI : 10.1145/258128.258179

E. Chernousova, Y. Golubev, and E. Krymova, Ordered smoothers with exponential weighting, Electronic Journal of Statistics, vol.7, pp.2395-2419, 2013.
DOI : 10.1214/13-EJS849
URL : https://hal.archives-ouvertes.fr/hal-01292430

D. Dai, P. Rigollet, and T. Zhang, Deviation optimal learning using greedy $Q$-aggregation, The Annals of Statistics, vol.40, issue.3, pp.1878-1905, 2012.
DOI : 10.1214/12-AOS1025
URL : http://doi.org/10.1214/12-aos1025

D. Dai, P. Rigollet, L. Xia, and T. Zhang, Aggregation of affine estimators, Electronic Journal of Statistics, vol.8, issue.1, pp.302-327, 2014.
DOI : 10.1214/14-EJS886

A. S. Dalalyan, SOCP based variance free Dantzig Selector with application to robust estimation, Comptes Rendus Mathematique, vol.350, issue.15-16, pp.785-788, 2012.
DOI : 10.1016/j.crma.2012.09.016

A. S. Dalalyan and J. Salmon, Sharp oracle inequalities for aggregation of affine estimators, The Annals of Statistics, vol.40, issue.4, pp.2327-2355, 2012.
DOI : 10.1214/12-AOS1038
URL : https://hal.archives-ouvertes.fr/hal-00587225

A. S. Dalalyan and A. B. Tsybakov, Aggregation by Exponential Weighting and Sharp Oracle Inequalities, Learning Theory, pp.97-111, 2007.
DOI : 10.1007/978-3-540-72927-3_9
URL : https://hal.archives-ouvertes.fr/hal-00160857

A. S. Dalalyan and A. B. Tsybakov, Aggregation by exponential weighting, sharp PAC-Bayesian bounds and sparsity, Machine Learning, vol.72, issue.1-2, pp.39-61, 2008.
DOI : 10.1007/s10994-008-5051-0
URL : https://hal.archives-ouvertes.fr/hal-00291504

A. S. Dalalyan and A. B. Tsybakov, Sparse regression learning by aggregation and Langevin Monte-Carlo, Journal of Computer and System Sciences, vol.78, issue.5, pp.1423-1443, 2012.
DOI : 10.1016/j.jcss.2011.12.023
URL : https://hal.archives-ouvertes.fr/hal-00773553

A. S. Dalalyan, M. Hebiri, K. Meziani, and J. Salmon, Learning heteroscedastic models by convex programming under group sparsity, ICML, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00813908

D. L. Donoho, I. M. Johnstone, G. Kerkyacharian, and D. Picard, Wavelet shrinkage: asymptopia?, J. Roy. Statist. Soc. Ser. B, vol.57, issue.2, pp.301-369, 1995.

Y. Freund, Boosting a Weak Learning Algorithm by Majority, Information and Computation, vol.121, issue.2, pp.256-285, 1995.
DOI : 10.1006/inco.1995.1136
URL : http://www.informatik.uni-bonn.de/III/lehre/AG/Mustererkennung/WS97/../postscript/maj.ps.gz

T. Gasser and H. Müller, Estimating regression functions and their derivatives by the kernel method, Scand. J. Statist, vol.11, issue.3, pp.171-185, 1984.

R. Genuer, Forêts aléatoires : aspects théoriques, sélection de variables et applications, 2011.

S. Gerchinovitz, Prediction of individual sequences and prediction in the statistical framework: some links around sparse regression and aggregation techniques, PhD thesis, 2011.
URL : https://hal.archives-ouvertes.fr/tel-00653550

C. Giraud, Mixing least-squares estimators when the variance is unknown, Bernoulli, vol.14, issue.4, pp.1089-1107, 2008.
DOI : 10.3150/08-BEJ135
URL : https://hal.archives-ouvertes.fr/hal-00184869

C. Giraud, S. Huet, and N. Verzelen, High-Dimensional Regression with Unknown Variance, Statistical Science, vol.27, issue.4, pp.500-518, 2012.
DOI : 10.1214/12-STS398
URL : https://hal.archives-ouvertes.fr/hal-00626630

Y. Golubev, Exponential weighting and oracle inequalities for projection estimates, Problems of Information Transmission, vol.48, issue.3, pp.269-280, 2012.
DOI : 10.1134/S0032946012030076

Y. Golubev and D. Ostrovski, Concentration inequalities for the exponential weighting method, Mathematical Methods of Statistics, vol.23, issue.1, 2014.
DOI : 10.3103/S1066530714010025
URL : https://hal.archives-ouvertes.fr/hal-01292413

B. Guedj and P. Alquier, PAC-Bayesian estimation and prediction in sparse additive models, Electronic Journal of Statistics, vol.7, pp.264-291, 2013.
DOI : 10.1214/13-EJS771
URL : https://hal.archives-ouvertes.fr/hal-00722969

T. Hastie, R. Tibshirani, and J. Friedman, The elements of statistical learning.

D. Hsu, S. M. Kakade, and T. Zhang, A tail inequality for quadratic forms of subgaussian random vectors, Electronic Communications in Probability, vol.17, 2012.
DOI : 10.1214/ECP.v17-2079

G. Lecué, Optimal rates of aggregation in classification under low noise assumption, Bernoulli, vol.13, issue.4, pp.1000-1022, 2007.
DOI : 10.3150/07-BEJ6044

G. Leung and A. R. Barron, Information theory and mixing least-squares regressions, IEEE Trans. Inform. Theory, vol.52, issue.8, pp.3396-3410, 2006.
DOI : 10.1109/tit.2006.878172

N. Littlestone and M. K. Warmuth, The Weighted Majority Algorithm, Information and Computation, vol.108, issue.2, pp.212-261, 1994.
DOI : 10.1006/inco.1994.1009

E. Nadaraya, On non-parametric estimates of density functions and regression curves, Theory of Probability & Its Applications, vol.10, pp.186-190, 1965.

A. Nemirovski, Topics in non-parametric statistics, Lectures on probability theory and statistics, pp.85-277, 1998.

P. Rigollet, Inégalités d'oracle, agrégation et adaptation, 2006.

P. Rigollet and A. B. Tsybakov, Linear and convex aggregation of density estimators, Mathematical Methods of Statistics, vol.16, issue.3, pp.260-280, 2007.
DOI : 10.3103/S1066530707030052
URL : https://hal.archives-ouvertes.fr/hal-00068216

P. Rigollet and A. B. Tsybakov, Sparse Estimation by Exponential Weighting, Statistical Science, vol.27, issue.4, pp.558-575, 2012.
DOI : 10.1214/12-STS393
URL : http://doi.org/10.1214/12-sts393

R. E. Schapire, The strength of weak learnability, Mach. Learn, vol.5, issue.2, pp.197-227, 1990.

G. Schwarz, Estimating the Dimension of a Model, The Annals of Statistics, vol.6, issue.2, pp.461-464, 1978.
DOI : 10.1214/aos/1176344136

T. Sun and C. Zhang, Scaled sparse linear regression, Biometrika, vol.99, issue.4, pp.879-898, 2012.
DOI : 10.1093/biomet/ass043

R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society, Series B, vol.58, issue.1, pp.267-288, 1996.
DOI : 10.1111/j.2517-6161.1996.tb02080.x

A. B. Tsybakov, Optimal Rates of Aggregation, Learning Theory and Kernel Machines, pp.303-313, 2003.
DOI : 10.1007/978-3-540-45167-9_23
URL : https://hal.archives-ouvertes.fr/hal-00104867

A. B. Tsybakov, Agrégation d'estimateurs et optimisation stochastique, J. Soc. Fr. Stat. & Rev. Stat. Appl, vol.149, issue.1, pp.3-26, 2008.

V. G. Vovk, Aggregating strategies, Proceedings of the Third Annual Workshop on Computational Learning Theory, COLT '90, pp.371-386, 1990.
DOI : 10.1016/B978-1-55860-146-8.50032-1

G. Wahba, Spline models for observational data, CBMS-NSF Regional Conference Series in Applied Mathematics, 1990.
DOI : 10.1137/1.9781611970128

G. S. Watson, Smooth regression analysis, Sankhya Ser. A, vol.26, pp.359-372, 1964.

Y. Yang, Mixing strategies for density estimation, The Annals of Statistics, vol.28, issue.1, pp.75-87, 2000.
DOI : 10.1214/aos/1016120365
URL : http://doi.org/10.1214/aos/1016120365

Y. Yang, Combining Different Procedures for Adaptive Regression, Journal of Multivariate Analysis, vol.74, issue.1, pp.135-161, 2000.
DOI : 10.1006/jmva.1999.1884
URL : https://doi.org/10.1006/jmva.1999.1884

Y. Yang, Adaptive estimation in pattern recognition by combining different procedures, Statist. Sinica, vol.10, issue.4, pp.1069-1089, 2000.

Y. Yang, Adaptive Regression by Mixing, Journal of the American Statistical Association, vol.96, issue.454, pp.574-588, 2001.
DOI : 10.1198/016214501753168262
URL : http://www.public.iastate.edu/~stat/preprint//articles/1999-12.ps

Y. Yang, Regression with multiple candidate models: selecting or mixing?, Statist. Sinica, vol.13, issue.3, pp.783-809, 2003.

Y. Yang, Aggregating regression procedures to improve performance, Bernoulli, vol.10, issue.1, pp.25-47, 2004.
DOI : 10.3150/bj/1077544602
URL : http://doi.org/10.3150/bj/1077544602

H. Zou and T. Hastie, Regularization and variable selection via the elastic net, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.67, issue.2, pp.301-320, 2005.
DOI : 10.1111/j.1467-9868.2005.00503.x