A. Achille and S. Soatto, Emergence of invariance and disentanglement in deep representations, Journal of Machine Learning Research, vol.19, issue.50, pp.1-34, 2018.

P. Alquier and G. Biau, Sparse single-index model, Journal of Machine Learning Research, vol.14, pp.243-280, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00556652

P. Alquier and B. Guedj, Simpler PAC-Bayesian bounds for hostile data, Machine Learning, vol.107, pp.887-902, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01385064

P. Alquier and K. Lounici, PAC-Bayesian theorems for sparse regression estimation with exponential weights, Electronic Journal of Statistics, vol.5, pp.127-145, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00607297

P. Alquier, J. Ridgway, and N. Chopin, On the properties of variational approximations of Gibbs posteriors, Journal of Machine Learning Research, vol.17, issue.1, pp.8374-8414, 2016.
URL : https://hal.archives-ouvertes.fr/hal-02403354

A. Ambroladze, E. Parrado-Hernández, and J. Shawe-Taylor, Tighter PAC-Bayes bounds, Advances in Neural Information Processing Systems, NIPS, pp.9-16, 2007.

J. Audibert and O. Bousquet, Combining PAC-Bayesian and generic chaining bounds, Journal of Machine Learning Research, 2007.

L. Bégin, P. Germain, F. Laviolette, and J. Roy, PAC-Bayesian theory for transductive learning, AISTATS, 2014.

L. Bégin, P. Germain, F. Laviolette, and J. Roy, PAC-Bayesian bounds based on the Rényi divergence, AISTATS, 2016.

M. Belkin, D. Hsu, S. Ma, and S. Mandal, Reconciling modern machine learning and the bias-variance trade-off, 2018.

O. Bousquet and A. Elisseeff, Stability and generalization, Journal of Machine Learning Research, vol.2, pp.499-526, 2002.

O. Catoni, A PAC-Bayesian approach to adaptive classification, 2003.

O. Catoni, Statistical Learning Theory and Stochastic Optimization, 2001.
URL : https://hal.archives-ouvertes.fr/hal-00104952

O. Catoni, PAC-Bayesian Supervised Classification: The Thermodynamics of Statistical Learning, Lecture Notes-Monograph Series, Institute of Mathematical Statistics, vol.56, 2007.
URL : https://hal.archives-ouvertes.fr/hal-00206119

I. Csiszár, I-divergence geometry of probability distributions and minimization problems, Annals of Probability, vol.3, pp.146-158, 1975.

P. Derbeko, R. El-Yaniv, and R. Meir, Explicit learning curves for transduction and application to clustering and compression algorithms, J. Artif. Intell. Res. (JAIR), vol.22, 2004.

M. D. Donsker and S. S. Varadhan, Asymptotic evaluation of certain Markov process expectations for large time, Communications on Pure and Applied Mathematics, vol.28, 1975.

G. K. Dziugaite and D. M. Roy, Computing nonvacuous generalization bounds for deep (stochastic) neural networks with many more parameters than training data, Proceedings of Uncertainty in Artificial Intelligence (UAI), 2017.

G. K. Dziugaite and D. M. Roy, Data-dependent PAC-Bayes priors via differential privacy, NeurIPS, 2018.

G. K. Dziugaite and D. M. Roy, Entropy-SGD optimizes the prior of a PAC-Bayes bound: Generalization properties of Entropy-SGD and data-dependent priors, International Conference on Machine Learning, pp.1376-1385, 2018.

M. M. Fard and J. Pineau, PAC-Bayesian model selection for reinforcement learning, Advances in Neural Information Processing Systems (NIPS), 2010.

M. M. Fard, J. Pineau, and C. Szepesvári, PAC-Bayesian Policy Evaluation for Reinforcement Learning, UAI, Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence, pp.195-202, 2011.

S. Gerchinovitz, Prédiction de suites individuelles et cadre statistique classique : étude de quelques liens autour de la régression parcimonieuse et des techniques d'agrégation, 2011.

P. Germain, Généralisations de la théorie PAC-bayésienne pour l'apprentissage inductif, l'apprentissage transductif et l'adaptation de domaine, 2015.

P. Germain, A. Lacasse, F. Laviolette, and M. Marchand, PAC-Bayesian learning of linear classifiers, Proceedings of the 26th Annual International Conference on Machine Learning, ICML, 2009.

P. Germain, A. Lacasse, M. Marchand, S. Shanian, and F. Laviolette, From PAC-Bayes bounds to KL regularization, Advances in Neural Information Processing Systems, pp.603-610, 2009.

P. Germain, A. Habrard, F. Laviolette, and E. Morvant, A new PAC-Bayesian perspective on domain adaptation, Proceedings of International Conference on Machine Learning, vol.48, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01163722

M. Ghavamzadeh, S. Mannor, J. Pineau, and A. Tamar, Bayesian reinforcement learning: A survey, Foundations and Trends in Machine Learning, vol.8, pp.359-483, 2015.

B. Guedj and P. Alquier, PAC-Bayesian estimation and prediction in sparse additive models, Electronic Journal of Statistics, vol.7, pp.264-291, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00722969

B. Guedj and S. Robbiano, PAC-Bayesian high dimensional bipartite ranking, Journal of Statistical Planning and Inference, vol.196, pp.70-86, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01226472

M. Higgs and J. Shawe-Taylor, A PAC-Bayes bound for tailored density estimation, Proceedings of the International Conference on Algorithmic Learning Theory (ALT), 2010.

A. Lacasse, F. Laviolette, M. Marchand, P. Germain, and N. Usunier, PAC-Bayes bounds for the risk of the majority vote and the variance of the Gibbs classifier, Advances in Neural Information Processing Systems, pp.769-776, 2007.

J. Langford, Tutorial on practical prediction theory for classification, Journal of Machine Learning Research, 2005.

J. Langford and M. Seeger, Bounds for averaging classifiers, 2001.

J. Langford and J. Shawe-Taylor, PAC-Bayes & margins, Advances in Neural Information Processing Systems (NIPS), 2002.

G. Lever, F. Laviolette, and J. Shawe-Taylor, Distribution-dependent PAC-Bayes priors, International Conference on Algorithmic Learning Theory, pp.119-133, 2010.

G. Lever, F. Laviolette, and J. Shawe-Taylor, Tighter PAC-Bayes bounds through distribution-dependent priors, Theoretical Computer Science, vol.473, pp.4-28, 2013.

C. Li, W. Jiang, and M. Tanner, General oracle inequalities for Gibbs posterior with application to ranking, Conference on Learning Theory, pp.512-521, 2013.

L. Li, B. Guedj, and S. Loustau, A quasi-Bayesian perspective to online clustering, Electronic Journal of Statistics, vol.12, issue.2, pp.3071-3113, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01264233

B. London, A PAC-Bayesian analysis of randomized learning with application to stochastic gradient descent, Advances in Neural Information Processing Systems, pp.2931-2940, 2017.

B. London, B. Huang, B. Taskar, and L. Getoor, PAC-Bayesian collective stability, Artificial Intelligence and Statistics, pp.585-594, 2014.

A. Maurer, A note on the PAC-Bayesian Theorem, 2004.

D. McAllester, Some PAC-Bayesian theorems, Proceedings of the International Conference on Computational Learning Theory (COLT), 1998.

D. McAllester, PAC-Bayesian stochastic model selection, Machine Learning, vol.37, 1999.

D. McAllester, Simplified PAC-Bayesian margin bounds, COLT, 2003.

B. Neyshabur, S. Bhojanapalli, D. A. McAllester, and N. Srebro, Exploring generalization in deep learning, Advances in Neural Information Processing Systems, pp.5947-5956, 2017.

E. Parrado-Hernández, A. Ambroladze, J. Shawe-Taylor, and S. Sun, PAC-Bayes bounds with data dependent priors, Journal of Machine Learning Research, vol.13, pp.3507-3531, 2012.

O. Rivasplata, E. Parrado-Hernández, J. Shawe-Taylor, S. Sun, and C. Szepesvári, PAC-Bayes bounds for stable algorithms with instance-dependent priors, Advances in Neural Information Processing Systems, pp.9214-9224, 2018.

M. Seeger, PAC-Bayesian generalization bounds for Gaussian processes, Journal of Machine Learning Research, vol.3, pp.233-269, 2002.

M. Seeger, Bayesian Gaussian Process Models: PAC-Bayesian Generalisation Error Bounds and Sparse Approximations, 2003.

Y. Seldin and N. Tishby, PAC-Bayesian analysis of co-clustering and beyond, Journal of Machine Learning Research, vol.11, pp.3595-3646, 2010.

Y. Seldin, P. Auer, F. Laviolette, J. Shawe-Taylor, and R. Ortner, PAC-Bayesian analysis of contextual bandits, Advances in Neural Information Processing Systems (NIPS), 2011.

Y. Seldin, F. Laviolette, N. Cesa-Bianchi, J. Shawe-Taylor, and P. Auer, PAC-Bayesian inequalities for martingales, IEEE Transactions on Information Theory, vol.58, issue.12, pp.7086-7093, 2012.

J. Shawe-Taylor and D. Hardoon, PAC-Bayes analysis of maximum entropy classification, Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2009.

J. Shawe-Taylor and R. C. Williamson, A PAC analysis of a Bayes estimator, Proceedings of the 10th Annual Conference on Computational Learning Theory, pp.2-9, 1997.

J. Shawe-Taylor, P. L. Bartlett, R. C. Williamson, and M. Anthony, Structural risk minimization over data-dependent hierarchies, IEEE Transactions on Information Theory, vol.44, issue.5, 1998.

N. Thiemann, C. Igel, O. Wintenberger, and Y. Seldin, A Strongly Quasiconvex PAC-Bayesian Bound, International Conference on Algorithmic Learning Theory, ALT, pp.466-492, 2017.

N. Tishby, F. Pereira, and W. Bialek, The information bottleneck method, Allerton Conference on Communication, Control and Computation, 1999.

L. G. Valiant, A theory of the learnable, Communications of the ACM, vol.27, issue.11, pp.1134-1142, 1984.