A. El Alaoui and M. W. Mahoney, Fast randomized kernel methods with statistical guarantees, Neural Information Processing Systems, 2015.

D. Calandriello, A. Lazaric, and M. Valko, Distributed sequential sampling for kernel matrix approximation, International Conference on Artificial Intelligence and Statistics, 2017.

G. Cavallanti, N. Cesa-Bianchi, and C. Gentile, Tracking the best hyperplane with a simple budget Perceptron, Machine Learning, pp.143-167, 2007.
DOI : 10.1007/s10994-007-5003-0
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.107.6201

N. Cesa-Bianchi, A. Conconi, and C. Gentile, A Second-Order Perceptron Algorithm, SIAM Journal on Computing, vol.34, issue.3, pp.640-668, 2005.
DOI : 10.1137/S0097539703432542
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.105.8528

O. Dekel, S. Shalev-Shwartz, and Y. Singer, The Forgetron: A Kernel-Based Perceptron on a Budget, SIAM Journal on Computing, vol.37, issue.5, pp.1342-1372, 2008.
DOI : 10.1137/060666998

J. Duchi, E. Hazan, and Y. Singer, Adaptive subgradient methods for online learning and stochastic optimization, Journal of Machine Learning Research, vol.12, pp.2121-2159, 2011.

A. Gammerman, Y. Kalnishkan, and V. Vovk, On-line prediction with kernels and the complexity approximation principle, Uncertainty in Artificial Intelligence, 2004.

M. Ghashami, E. Liberty, J. M. Phillips, and D. P. Woodruff, Frequent Directions: Simple and Deterministic Matrix Sketching, SIAM Journal on Computing, vol.45, issue.5, pp.1762-1792, 2016.
DOI : 10.1137/15M1009718
URL : http://arxiv.org/abs/1501.01711

M. Ghashami, D. J. Perry, and J. M. Phillips, Streaming kernel principal component analysis, International Conference on Artificial Intelligence and Statistics, 2016.

E. Hazan, A. Kalai, S. Kale, and A. Agarwal, Logarithmic regret algorithms for online convex optimization, Conference on Learning Theory, 2006.
DOI : 10.1007/11776420_37
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.126.5990

D. Kingma and J. Ba, Adam: A method for stochastic optimization, International Conference on Learning Representations, 2015.

J. Kivinen, A. J. Smola, and R. C. Williamson, Online Learning with Kernels, IEEE Transactions on Signal Processing, vol.52, issue.8, pp.2165-2176, 2004.
DOI : 10.1109/TSP.2004.830991
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.11.2062

Q. Le, T. Sarlós, and A. J. Smola, Fastfood - Approximating kernel expansions in loglinear time, International Conference on Machine Learning, 2013.

T. Le, T. Nguyen, V. Nguyen et al., Dual Space Gradient Descent for Online Learning, Neural Information Processing Systems, 2016.

J. Lu, S. C. H. Hoi, J. Wang et al., Large scale online kernel learning, Journal of Machine Learning Research, vol.17, issue.47, pp.1-43, 2016.

H. Luo, A. Agarwal, N. Cesa-Bianchi, and J. Langford, Efficient second-order online learning via sketching, Neural Information Processing Systems, 2016.

Y. Nesterov and A. Nemirovskii, Interior-point polynomial algorithms in convex programming, Society for Industrial and Applied Mathematics, 1994.
DOI : 10.1137/1.9781611970791

F. Orabona and K. Crammer, New adaptive algorithms for online classification, Neural Information Processing Systems, 2010.

F. Orabona, J. Keshet, and B. Caputo, The projectron, International Conference on Machine Learning, 2008.
DOI : 10.1145/1390156.1390247

A. Rudi, R. Camoriano, and L. Rosasco, Less is more: Nyström computational regularization, Neural Information Processing Systems, 2015.

B. Schölkopf and A. J. Smola, Learning with kernels: Support vector machines, regularization, optimization, and beyond, 2001.

N. Srinivas, A. Krause, S. M. Kakade, and M. Seeger, Gaussian process optimization in the bandit setting: No regret and experimental design, International Conference on Machine Learning, 2010.

J. A. Tropp, Freedman's inequality for matrix martingales, Electronic Communications in Probability, vol.16, pp.262-270, 2011.
DOI : 10.1214/ECP.v16-1624
URL : http://arxiv.org/abs/1101.3039

Z. Wang, K. Crammer, and S. Vucetic, Breaking the curse of kernelization: Budgeted stochastic gradient descent for large-scale SVM training, Journal of Machine Learning Research, vol.13, pp.3103-3131, 2012.

F. Zhdanov and Y. Kalnishkan, An identity for kernel ridge regression, Algorithmic Learning Theory, 2010.
DOI : 10.1007/978-3-642-16108-7_32
URL : http://arxiv.org/abs/1112.1390

C. Zhu and H. Xu, Online gradient descent in function space, arXiv preprint, 2015.

M. Zinkevich, Online convex programming and generalized infinitesimal gradient ascent, International Conference on Machine Learning, 2003.