R. Abraham, J. E. Marsden, and T. Ratiu, Manifolds, Tensor Analysis, and Applications, vol.75, 2012.

A. G. de G. Matthews, M. Rowland, J. Hron, R. E. Turner, and Z. Ghahramani, Gaussian process behaviour in wide deep neural networks, International Conference on Learning Representations, 2018.

S. Mei, A. Montanari, and P. Nguyen, A mean field view of the landscape of two-layer neural networks, Proceedings of the National Academy of Sciences, vol.115, issue.33, pp.E7665-E7671, 2018.

E. Oyallon and S. Mallat, Deep roto-translation scattering for object classification, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.2865-2873, 2015.

A. Rahimi and B. Recht, Random features for large-scale kernel machines, Advances in Neural Information Processing Systems, pp.1177-1184, 2008.

A. Rahimi and B. Recht, Weighted sums of random kitchen sinks: Replacing minimization with randomization in learning, Advances in Neural Information Processing Systems, pp.1313-1320, 2009.

B. Recht, R. Roelofs, L. Schmidt, and V. Shankar, Do ImageNet classifiers generalize to ImageNet?, Proceedings of the 36th International Conference on Machine Learning, pp.5389-5400, 2019.

G. M. Rotskoff and E. Vanden-Eijnden, Neural networks as interacting particle systems: Asymptotic convexity of the loss landscape and universal scaling of the approximation error, Advances in Neural Information Processing Systems, 2018.

D. Saad and S. A. Solla, On-line learning in soft committee machines, Physical Review E, vol.52, issue.4, p.4225, 1995.

D. Scieur, V. Roulet, F. Bach, and A. d'Aspremont, Integration methods and optimization algorithms, Advances in Neural Information Processing Systems, pp.1109-1118, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01474045

K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, 2014.

J. Sirignano and K. Spiliopoulos, Mean field analysis of neural networks: A central limit theorem, Stochastic Processes and their Applications, 2019.

Y. Yao, L. Rosasco, and A. Caponnetto, On early stopping in gradient descent learning, Constructive Approximation, vol.26, pp.289-315, 2007.

S. Zagoruyko and N. Komodakis, Wide residual networks, Proceedings of the British Machine Vision Conference (BMVC), pp.87.1-87.12, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01832503

C. Zhang, S. Bengio, M. Hardt, B. Recht, and O. Vinyals, Understanding deep learning requires rethinking generalization, International Conference on Learning Representations, 2017.

D. Zou, Y. Cao, D. Zhou, and Q. Gu, Stochastic gradient descent optimizes over-parameterized deep ReLU networks, Machine Learning Journal, 2019.