M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen et al., TensorFlow: Large-scale machine learning on heterogeneous systems, Josh Levenberg, Dandelion Mané, 2015.

C. Blundell, J. Cornebise, K. Kavukcuoglu, and D. Wierstra, Weight uncertainty in neural network, Proceedings of the 32nd International Conference on Machine Learning, vol.37, pp.1613-1622, 2015.

O. Catoni, Pac-Bayesian Supervised Classification: The Thermodynamics of Statistical Learning, IMS Lecture Notes Monogr. Ser, vol.56, pp.1-163, 2007.
URL : https://hal.archives-ouvertes.fr/hal-00206119

K. Gintare, D. Dziugaite, and . Roy, Computing nonvacuous generalization bounds for deep (stochastic) neural networks with many more parameters than training data, Conference on Uncertainty in Artificial Intelligence 33, 2017.

P. Germain, A. Lacasse, F. Laviolette, and M. Marchand, PAC-Bayesian learning of linear classifiers, Proceedings of the 26th Annual International Conference on Machine Learning -ICML '09, pp.1-8, 2009.

P. Germain, F. Bach, A. Lacoste, S. Lacoste-julien, ;. D. Lee et al., PAC-Bayesian theory meets bayesian inference, Advances in Neural Information Processing Systems, vol.29, pp.1884-1892, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01324072

P. Diederik, J. Kingma, and . Ba, Adam: A method for stochastic optimization, 2014.

P. Diederik, M. Kingma, and . Welling, Auto-encoding variational bayes, 2013.

P. Durk, T. Kingma, M. Salimans, and . Welling, Variational dropout and the local reparameterization trick, Advances in Neural Information Processing Systems, vol.28, pp.2575-2583, 2015.

J. Knoblauch, J. Jewson, and T. Damoulas, Generalized Variational Inference: Three arguments for deriving new Posteriors, 2019.

J. Langford and R. Caruana, Not) Bounding the True Error, Advances in Neural Information Processing Systems 14, pp.809-816, 2002.

J. Langford and M. Seeger, Bounds for averaging classifiers, 2001.

G. Letarte, P. Germain, B. Guedj, and F. Laviolette, Dichotomize and generalize: PAC-Bayesian binary activated deep neural networks, Advances in Neural Information Processing Systems, vol.32, pp.6872-6882, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02482352

S. Mohamed, M. Rosca, M. Figurnov, and A. Mnih, Monte carlo gradient estimation in machine learning, 2019.

M. Seeger, J. Langford, and N. Megiddo, An improved predictive accuracy bound for averaging classifiers, Proceedings of the 18th International Conference on Machine Learning, number CONF, pp.290-297, 2001.

J. Ronald and . Williams, Simple statistical gradient-following algorithms for connectionist reinforcement learning, Machine learning, vol.8, issue.3-4, pp.229-256, 1992.

W. Zhou, V. Veitch, M. Austern, R. P. Adams, and P. Orbanz, Non-vacuous generalization bounds at the ImageNet scale: A PAC-Bayesian compression approach, International Conference on Learning Representations, 2019.