F. Anselmi, L. Rosasco, and T. Poggio, On invariance and selectivity in representation learning, Information and Inference, vol.5, issue.2, pp.134-158, 2016.
DOI : 10.1093/imaiai/iaw009

URL : https://academic.oup.com/imaiai/article-pdf/5/2/134/6990886/iaw009.pdf

F. Anselmi, L. Rosasco, C. Tan, and T. Poggio, Deep convolutional networks are hierarchical kernel machines, 2015.

F. Bach, On the equivalence between kernel quadrature rules and random feature expansions, Journal of Machine Learning Research (JMLR), vol.18, pp.1-38, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01118276

P. Bartlett, D. J. Foster, and M. Telgarsky, Spectrally-normalized margin bounds for neural networks, Advances in Neural Information Processing Systems (NIPS), 2017.

A. Bietti and J. Mairal, Invariance and stability of deep convolutional representations, Advances in Neural Information Processing Systems (NIPS), 2017.
URL : https://hal.archives-ouvertes.fr/hal-01630265

L. Bo, K. Lai, X. Ren, and D. Fox, Object recognition with hierarchical kernel descriptors, CVPR 2011, 2011.
DOI : 10.1109/CVPR.2011.5995719

URL : http://www.cs.washington.edu/homes/lfb/paper/cvpr11.pdf

S. Boucheron, O. Bousquet, and G. Lugosi, Theory of classification: A survey of some recent advances, ESAIM: Probability and Statistics, pp.323-375, 2005.
URL : https://hal.archives-ouvertes.fr/hal-00017923

J. Bruna and S. Mallat, Invariant Scattering Convolution Networks, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.8, pp.1872-1886, 2013.
DOI : 10.1109/TPAMI.2012.230

URL : http://arxiv.org/pdf/1203.1513

J. Bruna, A. Szlam, and Y. LeCun, Learning stable group invariant representations with convolutional networks, 2013.

Y. Cho and L. K. Saul, Kernel methods for deep learning, Advances in Neural Information Processing Systems (NIPS), 2009.

M. Cisse, P. Bojanowski, E. Grave, Y. Dauphin, and N. Usunier, Parseval networks: Improving robustness to adversarial examples, International Conference on Machine Learning (ICML), 2017.

T. Cohen and M. Welling, Group equivariant convolutional networks, International Conference on Machine Learning (ICML), 2016.

A. Daniely, R. Frostig, V. Gupta, and Y. Singer, Random features for compositional kernels, 2017.

A. Daniely, R. Frostig, and Y. Singer, Toward deeper understanding of neural networks: The power of initialization and a dual view on expressivity, Advances in Neural Information Processing Systems (NIPS), 2016.

J. Diestel and J. J. Uhl, Vector Measures, 1977.

S. Fine and K. Scheinberg, Efficient SVM training using low-rank kernel representations, Journal of Machine Learning Research (JMLR), vol.2, pp.243-264, 2001.

B. Haasdonk and H. Burkhardt, Invariant kernel functions for pattern analysis and machine learning, Machine Learning, vol.68, issue.1, pp.35-61, 2007.
DOI : 10.1007/s10994-007-5009-7

URL : https://link.springer.com/content/pdf/10.1007%2Fs10994-007-5009-7.pdf

Q. Le, T. Sarlós, and A. Smola, Fastfood: approximating kernel expansions in loglinear time, Proceedings of the International Conference on Machine Learning (ICML), 2013.

Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard et al., Backpropagation Applied to Handwritten Zip Code Recognition, Neural Computation, vol.1, issue.4, pp.541-551, 1989.
DOI : 10.1162/neco.1989.1.4.541

T. Liang, T. Poggio, A. Rakhlin, and J. Stokes, Fisher-Rao metric, geometry, and complexity of neural networks, 2017.

S. Mallat, Group Invariant Scattering, Communications on Pure and Applied Mathematics, vol.65, issue.10, pp.1331-1398, 2012.
DOI : 10.1002/cpa.21413

URL : http://arxiv.org/pdf/1101.2286

G. Montavon, M. L. Braun, and K. Müller, Kernel analysis of deep networks, Journal of Machine Learning Research (JMLR), vol.12, pp.2563-2581, 2011.

Y. Mroueh, S. Voinea, and T. A. Poggio, Learning with group invariant features: A kernel perspective, Advances in Neural Information Processing Systems (NIPS), 2015.

K. Muandet, K. Fukumizu, B. Sriperumbudur, and B. Schölkopf, Kernel mean embedding of distributions: A review and beyond, Foundations and Trends in Machine Learning, pp.1-141, 2017.

B. Neyshabur, S. Bhojanapalli, D. McAllester, and N. Srebro, Exploring generalization in deep learning, Advances in Neural Information Processing Systems (NIPS), 2017.

B. Neyshabur, S. Bhojanapalli, D. McAllester, and N. Srebro, A PAC-Bayesian approach to spectrally-normalized margin bounds for neural networks, 2017.

B. Neyshabur, R. Tomioka, and N. Srebro, Norm-based capacity control in neural networks, Conference on Learning Theory, 2015.

E. Oyallon and S. Mallat, Deep roto-translation scattering for object classification, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
DOI : 10.1109/CVPR.2015.7298904

URL : http://arxiv.org/pdf/1412.8659

A. Rahimi and B. Recht, Random features for large-scale kernel machines, Advances in Neural Information Processing Systems (NIPS), 2007.

A. Raj, A. Kumar, Y. Mroueh, T. Fletcher, and B. Schölkopf, Local group invariant representations via orbit embeddings, International Conference on Artificial Intelligence and Statistics (AISTATS), 2017.

S. Saitoh, Integral transforms, reproducing kernels and their applications, 1997.

I. J. Schoenberg, Positive definite functions on spheres, Duke Mathematical Journal, vol.9, issue.1, pp.96-108, 1942.
DOI : 10.1215/S0012-7094-42-00908-6

B. Schölkopf, Support Vector Learning, 1997.

B. Schölkopf, A. Smola, and K. Müller, Nonlinear Component Analysis as a Kernel Eigenvalue Problem, Neural Computation, vol.10, issue.5, pp.1299-1319, 1998.
DOI : 10.1162/089976698300017467

B. Schölkopf and A. J. Smola, Learning with kernels: support vector machines, regularization, optimization, and beyond, 2001.

S. Shalev-Shwartz and S. Ben-David, Understanding machine learning: From theory to algorithms, 2014.
DOI : 10.1017/CBO9781107298019

L. Sifre and S. Mallat, Rotation, Scaling and Deformation Invariant Scattering for Texture Discrimination, 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013.
DOI : 10.1109/CVPR.2013.163

URL : http://www.cmapx.polytechnique.fr/~sifre/research/cvpr_13_sifre_mallat_final.pdf

K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, International Conference on Learning Representations (ICLR), 2014.

A. J. Smola and B. Schölkopf, Sparse greedy matrix approximation for machine learning, Proceedings of the International Conference on Machine Learning (ICML), 2000.

E. M. Stein, Harmonic Analysis: Real-variable Methods, Orthogonality, and Oscillatory Integrals, 1993.

I. Steinwart, P. Thomann, and N. Schmid, Learning with hierarchical Gaussian kernels, 2016.

A. Torralba and A. Oliva, Statistics of natural image categories, Network: Computation in Neural Systems, vol.14, issue.3, pp.391-412, 2003.
DOI : 10.1088/0954-898x/14/3/302

URL : http://web.mit.edu/torralba/www/ne3302.pdf

C. Williams and M. Seeger, Using the Nyström method to speed up kernel machines, Advances in Neural Information Processing Systems (NIPS), 2001.

C. Zhang, S. Bengio, M. Hardt, B. Recht, and O. Vinyals, Understanding deep learning requires rethinking generalization, International Conference on Learning Representations (ICLR), 2017.

Y. Zhang, J. D. Lee, and M. I. Jordan, ℓ1-regularized neural networks are improperly learnable in polynomial time, International Conference on Machine Learning (ICML), 2016.