N. Aronszajn, Theory of reproducing kernels, Transactions of the American Mathematical Society, vol.68, issue.3, pp.337-404, 1950.
DOI : 10.2307/1990404

URL : https://www.ams.org/tran/1950-068-03/S0002-9947-1950-0051437-7/S0002-9947-1950-0051437-7.pdf

F. Bach, Self-concordant analysis for logistic regression, Electronic Journal of Statistics, vol.4, pp.384-414, 2010.
DOI : 10.1214/09-ejs521

URL : https://hal.archives-ouvertes.fr/hal-00426227

F. Bach, Adaptivity of averaged stochastic gradient descent to local strong convexity for logistic regression, Journal of Machine Learning Research, vol.15, issue.1, pp.595-627, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00804431

M. S. Bartlett, Approximate confidence intervals, Biometrika, vol.40, issue.1/2, pp.12-19, 1953.

G. Blanchard and N. Mücke, Optimal rates for regularization of statistical inverse learning problems, Foundations of Computational Mathematics, vol.18, issue.4, pp.971-1013, 2018.
DOI : 10.1007/s10208-017-9359-7

URL : http://arxiv.org/pdf/1604.04054

L. Bottou and O. Bousquet, The trade-offs of large scale learning, Advances in Neural Information Processing Systems, pp.161-168, 2008.

S. Boucheron and P. Massart, A high-dimensional Wilks phenomenon, Probability Theory and Related Fields, vol.150, pp.405-433, 2011.
DOI : 10.1007/s00440-010-0278-7

URL : https://hal.archives-ouvertes.fr/hal-00945509

A. Caponnetto and E. De Vito, Optimal rates for the regularized least-squares algorithm, Foundations of Computational Mathematics, vol.7, issue.3, pp.331-368, 2007.
DOI : 10.1007/s10208-006-0196-8

URL : http://publications.csail.mit.edu/tmp/MIT-CSAIL-TR-2005-027.pdf

N. Cesa-Bianchi, Y. Mansour, and O. Shamir, On the complexity of learning with kernels, Conference on Learning Theory, pp.297-325, 2015.

A. Dieuleveut and F. Bach, Nonparametric stochastic approximation with large step-sizes, The Annals of Statistics, vol.44, issue.4, pp.1363-1399, 2016.
DOI : 10.1214/15-aos1391

URL : https://hal.archives-ouvertes.fr/hal-01053831

S. Fischer and I. Steinwart, Sobolev norm learning rates for regularized least-squares algorithm, 2017.

D. J. Foster, S. Kale, H. Luo, M. Mohri, and K. Sridharan, Logistic regression: The importance of being improper, Proceedings of COLT, 2018.

S. Geman, E. Bienenstock, and R. Doursat, Neural networks and the bias/variance dilemma, Neural Computation, vol.4, issue.1, pp.1-58, 1992.
DOI : 10.1162/neco.1992.4.1.1

L. Lo Gerfo, L. Rosasco, F. Odone, E. De Vito, and A. Verri, Spectral algorithms for supervised learning, Neural Computation, vol.20, issue.7, pp.1873-1897, 2008.
DOI : 10.1162/neco.2008.05-07-517

URL : http://www.dima.unige.it/~devito/pub_files/spectral_finale.pdf

F. R. Hampel, E. M. Ronchetti, P. J. Rousseeuw, and W. A. Stahel, Robust statistics: the approach based on influence functions, vol.196, 2011.

E. Hazan, T. Koren, and K. Y. Levy, Logistic regression: tight bounds for stochastic and online optimization, Proceedings of The 27th Conference on Learning Theory, vol.35, pp.197-209, 2014.

I. Steinwart and C. Scovel, Fast rates for support vector machines using Gaussian kernels, The Annals of Statistics, vol.35, issue.2, pp.575-607, 2007.
DOI : 10.1214/009053606000001226

URL : https://doi.org/10.1214/009053606000001226

I. Steinwart, D. R. Hush, and C. Scovel, Optimal rates for regularized least squares regression, Proc. COLT, 2009.

J. A. Tropp, User-friendly tail bounds for sums of random matrices, Foundations of Computational Mathematics, vol.12, issue.4, pp.389-434, 2012.
DOI : 10.1007/s10208-011-9099-z

URL : https://authors.library.caltech.edu/27190/1/Caltech_ACM_TR_2010_01.pdf

S. Tu, R. Roelofs, S. Venkataraman, and B. Recht, Large scale kernel learning using block coordinate descent, 2016.

S. A. van de Geer, High-dimensional generalized linear models and the Lasso, The Annals of Statistics, vol.36, issue.2, pp.614-645, 2008.

A. W. van der Vaart, Asymptotic Statistics, vol.3, 2000.

T. van Erven, P. D. Grünwald, N. A. Mehta, M. D. Reid, and R. C. Williamson, Fast rates in statistical and online learning, Journal of Machine Learning Research, vol.16, pp.1793-1861, 2015.

G. Wahba, Spline Models for Observational Data, vol.59, 1990.

V. Yurinsky, Sums and Gaussian vectors, Lecture Notes in Mathematics, vol.1617, 1995.
DOI : 10.1007/bfb0092599
