P. Alquier and O. Wintenberger, Model selection for weakly dependent time series forecasting, Bernoulli, vol.18, issue.3, pp.883-913, 2012.
URL : https://hal.archives-ouvertes.fr/inria-00386733

P. Alquier, J. Ridgway, and N. Chopin, On the properties of variational approximations of Gibbs posteriors, JMLR, vol.17, issue.239, pp.1-41, 2016.

A. Ambroladze, E. Parrado-Hernández, and J. Shawe-Taylor, Tighter PAC-Bayes bounds, NIPS, 2006.

P. Billingsley, Probability and Measure, 1986.

C. M. Bishop, Pattern Recognition and Machine Learning (Information Science and Statistics), 2006.

O. Catoni, PAC-Bayesian supervised classification: the thermodynamics of statistical learning, Institute of Mathematical Statistics, vol.56, 2007.
URL : https://hal.archives-ouvertes.fr/hal-00206119

A. S. Dalalyan and A. B. Tsybakov, Aggregation by exponential weighting, sharp PAC-Bayesian bounds and sparsity, Machine Learning, vol.72, pp.39-61, 2008.

G. K. Dziugaite and D. M. Roy, Computing nonvacuous generalization bounds for deep (stochastic) neural networks with many more parameters than training data, UAI, 2017.

P. Germain, A. Lacasse, F. Laviolette, M. Marchand, and J. Roy, Risk bounds for the majority vote: From a PAC-Bayesian analysis to a learning algorithm, JMLR, vol.16, 2015.

P. Germain, F. R. Bach, A. Lacoste, and S. Lacoste-Julien, PAC-Bayesian theory meets Bayesian inference, NIPS, pp.1876-1884, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01324072

G. H. Golub and C. F. Van Loan, Matrix Computations, Johns Hopkins Studies in the Mathematical Sciences, 2013.

P. Grünwald, The safe Bayesian: learning the learning rate via the mixability gap, ALT, 2012.

B. Guedj, , 2019.

E. J. Hannan and M. Deistler, The Statistical Theory of Linear Systems, Classics in Applied Mathematics, Society for Industrial and Applied Mathematics, ISBN 9781611972191, 1988.

V. Kuznetsov and M. Mohri, Generalization bounds for non-stationary mixing processes, Machine Learning, vol.106, pp.93-117, 2017.

V. Kuznetsov and M. Mohri, Theory and algorithms for forecasting time series, 2018.

L. Ljung, System Identification: Theory for the User, 1999.

D. McAllester, Some PAC-Bayesian theorems, Machine Learning, vol.37, pp.355-363, 1999.

D. McAllester, Simplified PAC-Bayesian margin bounds, COLT, pp.203-215, 2003.

K. P. Murphy, Machine learning: a probabilistic perspective, 2012.

M. Seeger, PAC-Bayesian generalization bounds for Gaussian processes, JMLR, vol.3, pp.233-269, 2002.

S. Shalev-Shwartz and S. Ben-David, Understanding Machine Learning: From Theory to Algorithms, 2014.

R. Sheth and R. Khardon, Excess risk bounds for the Bayes risk using variational inference in latent Gaussian models, NIPS, pp.5151-5161, 2017.

T. Söderström and P. Stoica, System Identification, 1989.

L. G. Valiant, A theory of the learnable, Communications of the ACM, vol.27, issue.11, pp.1134-1142, 1984.

T. Zhang, Information-theoretic upper and lower bounds for statistical estimation, IEEE Trans. Information Theory, vol.52, issue.4, pp.1307-1321, 2006.