L. Anselmi, T. Rosasco, and . Poggio, On invariance and selectivity in representation learning, Information and Inference, vol.34, issue.2, pp.134-158, 2016.
DOI : 10.1109/TPAMI.2011.153

F. Bach, On the equivalence between kernel quadrature rules and random feature expansions, Journal of Machine Learning Research (JMLR), vol.18, pp.1-38, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01118276

P. Bartlett, D. J. Foster, and M. Telgarsky, Spectrally-normalized margin bounds for neural networks, Advances in Neural Information Processing Systems (NIPS), 2017.

A. Bietti and J. Mairal, Invariance and stability of deep convolutional representations, Advances in Neural Information Processing Systems (NIPS), 2017.
URL : https://hal.archives-ouvertes.fr/hal-01630265

L. Bo, K. Lai, X. Ren, and D. Fox, Object recognition with hierarchical kernel descriptors, CVPR 2011, 2011.
DOI : 10.1109/CVPR.2011.5995719
URL : http://www.cs.washington.edu/homes/lfb/paper/cvpr11.pdf

S. Boucheron, O. Bousquet, and G. Lugosi, Theory of classification: A survey of some recent advances. ESAIM: probability and statistics, pp.323-375, 2005.
URL : https://hal.archives-ouvertes.fr/hal-00017923

J. Bruna and S. Mallat, Invariant Scattering Convolution Networks, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.8, pp.1872-1886, 2013.
DOI : 10.1109/TPAMI.2012.230
URL : http://arxiv.org/pdf/1203.1513

J. Bruna, A. Szlam, and Y. Lecun, Learning stable group invariant representations with convolutional networks, 2013.

Y. Cho and L. K. Saul, Kernel methods for deep learning, Advances in Neural Information Processing Systems (NIPS), 2009.

M. Cisse, P. Bojanowski, E. Grave, Y. Dauphin, and N. Usunier, Parseval networks: Improving robustness to adversarial examples, International Conference on Machine Learning (ICML), 2017.

T. Cohen and M. Welling, Group equivariant convolutional networks, International Conference on Machine Learning (ICML), 2016.

A. Daniely, R. Frostig, and Y. Singer, Toward deeper understanding of neural networks: The power of initialization and a dual view on expressivity, Advances in Neural Information Processing Systems (NIPS), 2016.

A. Daniely, R. Frostig, V. Gupta, and Y. Singer, Random features for compositional kernels, 2017.

J. Diestel and J. J. Uhl, Vector Measures, 1977.

S. Fine and K. Scheinberg, Efficient SVM training using low-rank kernel representations, Journal of Machine Learning Research (JMLR), vol.2, pp.243-264, 2001.

B. Haasdonk and H. Burkhardt, Invariant kernel functions for pattern analysis and??machine learning, Machine Learning, vol.29, issue.1, pp.35-61, 2007.
DOI : 10.1017/CBO9780511809682

R. Kondor and S. Trivedi, On the generalization of equivariance and convolution in neural networks to the action of compact groups, 2018.

Y. Lecun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard et al., Backpropagation Applied to Handwritten Zip Code Recognition, Neural Computation, vol.1, issue.4, pp.541-551, 1989.
DOI : 10.1007/BF00133697

T. Liang, T. Poggio, A. Rakhlin, and J. Stokes, Fisher-rao metric, geometry, and complexity of neural networks, 2017.

G. Loosli, S. Canu, and L. Bottou, Training invariant support vector machines using selective sampling, Large Scale Kernel Machines, pp.301-320, 2007.

S. Mallat, Group Invariant Scattering, Communications on Pure and Applied Mathematics, vol.37, issue.10, pp.1331-1398, 2012.
DOI : 10.1137/S0036141002404838
URL : http://arxiv.org/pdf/1101.2286

G. Montavon, M. L. Braun, and K. Müller, Kernel analysis of deep networks, Journal of Machine Learning Research (JMLR), vol.12, pp.2563-2581, 2011.

Y. Mroueh, S. Voinea, and T. A. Poggio, Learning with group invariant features: A kernel perspective, Advances in Neural Information Processing Systems (NIPS), 2015.

K. Muandet, K. Fukumizu, B. Sriperumbudur, and B. Schölkopf, Kernel Mean Embedding of Distributions: A Review and Beyond, Machine Learning, pp.1-141, 2017.
DOI : 10.1561/2200000060

B. Neyshabur, R. Tomioka, and N. Srebro, Norm-based capacity control in neural networks, Conference on Learning Theory (COLT), 2015.

B. Neyshabur, S. Bhojanapalli, D. Mcallester, and N. Srebro, Exploring generalization in deep learning, Advances in Neural Information Processing Systems (NIPS), 2017.

B. Neyshabur, S. Bhojanapalli, D. Mcallester, and N. Srebro, A PAC-Bayesian approach to spectrally-normalized margin bounds for neural networks, Proceedings of the International Conference on Learning Representations (ICLR), 2018.

E. Oyallon and S. Mallat, Deep roto-translation scattering for object classification, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
DOI : 10.1109/CVPR.2015.7298904

A. Rahimi and B. Recht, Random features for large-scale kernel machines, Advances in Neural Information Processing Systems (NIPS), 2007.

A. Raj, A. Kumar, Y. Mroueh, T. Fletcher, and B. Schoelkopf, Local group invariant representations via orbit embeddings, International Conference on Artificial Intelligence and Statistics (AISTATS), 2017.

S. Saitoh, Integral transforms, reproducing kernels and their applications, 1997.

I. J. Schoenberg, Positive definite functions on spheres, Duke Mathematical Journal, vol.9, issue.1, pp.96-108, 1942.
DOI : 10.1215/S0012-7094-42-00908-6

B. Schölkopf, Support Vector Learning, 1997.

B. Schölkopf and A. J. Smola, Learning with kernels: support vector machines, regularization , optimization, and beyond, 2001.

B. Schölkopf, A. Smola, and K. Müller, Nonlinear Component Analysis as a Kernel Eigenvalue Problem, Neural Computation, vol.20, issue.5, pp.1299-1319, 1998.
DOI : 10.1007/BF02281970

S. Shalev-shwartz and S. Ben-david, Understanding machine learning: From theory to algorithms, 2014.
DOI : 10.1017/CBO9781107298019

L. Sifre and S. Mallat, Rotation, Scaling and Deformation Invariant Scattering for Texture Discrimination, 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013.
DOI : 10.1109/CVPR.2013.163

K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, International Conference on Learning Representations (ICLR), 2014.

A. J. Smola and B. Schölkopf, Sparse greedy matrix approximation for machine learning, Proceedings of the International Conference on Machine Learning (ICML), 2000.

E. M. Stein, Harmonic Analysis: Real-variable Methods, Orthogonality, and Oscillatory Integrals, 1993.

I. Steinwart, P. Thomann, and N. Schmid, Learning with hierarchical gaussian kernels, 2016.

A. Torralba and A. Oliva, Statistics of natural image categories. Network: computation in neural systems, pp.391-412, 2003.

A. Trouvé and L. Younes, Local Geometry of Deformable Templates, SIAM Journal on Mathematical Analysis, vol.37, issue.1, pp.17-59, 2005.
DOI : 10.1137/S0036141002404838

T. Wiatowski and H. Bölcskei, A Mathematical Theory of Deep Convolutional Neural Networks for Feature Extraction, IEEE Transactions on Information Theory, vol.64, issue.3, pp.1845-1866, 2018.
DOI : 10.1109/TIT.2017.2776228

C. Williams and M. Seeger, Using the Nyström method to speed up kernel machines, Advances in Neural Information Processing Systems (NIPS), 2001.

C. Zhang, S. Bengio, M. Hardt, B. Recht, and O. Vinyals, Understanding deep learning requires rethinking generalization, International Conference on Learning Representations (ICLR), 2017.