C. Ing, Accumulated prediction errors, information criteria and optimal forecasting for autoregressive time series, The Annals of Statistics, vol.35, issue.3, pp.1238-1277, 2007.
DOI : 10.1214/009053606000001550

URL : http://arxiv.org/abs/0708.2373

V. N. Vapnik, The Nature of Statistical Learning Theory, 1998.

P. Massart, Concentration Inequalities and Model Selection, Lecture Notes in Mathematics, 2006.

O. Catoni, Statistical Learning Theory and Stochastic Optimization, Lecture Notes in Mathematics, vol.1851, 2004.
DOI : 10.1007/b99352

URL : https://hal.archives-ouvertes.fr/hal-00104952

O. Catoni, PAC-Bayesian Supervised Classification (The Thermodynamics of Statistical Learning), IMS Lecture Notes-Monograph Series, vol.56, 2007.
URL : https://hal.archives-ouvertes.fr/hal-00206119

J. Audibert, Annales de l'Institut Henri Poincaré: Probability and Statistics, vol.40, issue.6, pp.685-736, 2004.

P. Alquier, PAC-Bayesian bounds for randomized empirical risk minimizers, Mathematical Methods of Statistics, vol.17, issue.4, pp.279-304, 2008.
DOI : 10.3103/S1066530708040017

URL : https://hal.archives-ouvertes.fr/hal-00354922

D. S. Modha and E. Masry, Memory-universal prediction of stationary random processes, IEEE Transactions on Information Theory, vol.44, issue.1, pp.117-133, 1998.
DOI : 10.1109/18.650998

J. Dedecker, P. Doukhan, G. Lang, J. R. León, S. Louhichi et al., Weak Dependence, Examples and Applications, Lecture Notes in Statistics, vol.190, 2007.

E. Rio, Comptes Rendus de l'Académie des Sciences de Paris, Série I, vol.330, pp.905-908, 2000.

D. W. Andrews, Non-strong mixing autoregressive processes, Journal of Applied Probability, vol.21, issue.4, pp.930-934, 1984.
DOI : 10.2307/3212764

J. Dedecker and C. Prieur, Probability Theory and Related Fields, vol.132, pp.203-235, 2005.