R. Agrawal, The continuum-armed bandit problem, SIAM Journal on Control and Optimization, vol.33, pp.1926-1951, 1995.

P. Auer, R. Ortner, and C. Szepesvári, Improved rates for the stochastic continuum-armed bandit problem, Conference on Learning Theory, 2007.

M. Gheshlaghi Azar, A. Lazaric, and E. Brunskill, Online stochastic optimization under correlated bandit feedback, International Conference on Machine Learning, 2014.

K. Azuma, Weighted sums of certain dependent random variables, Tohoku Mathematical Journal, vol.19, issue.3, pp.357-367, 1967.

S. Bubeck, R. Munos, and G. Stoltz, Pure exploration in multi-armed bandits problems, Algorithmic Learning Theory, 2009.

S. Bubeck, R. Munos, G. Stoltz, and C. Szepesvári, X-armed bandits, Journal of Machine Learning Research, vol.12, pp.1587-1627, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00450235

A. D. Bull, Adaptive-treed bandits, Bernoulli, vol.21, issue.4, pp.2289-2307, 2015.
DOI : 10.3150/14-bej644
URL : https://doi.org/10.3150/14-bej644

E. W. Cope, Regret and convergence bounds for immediate-reward reinforcement learning with continuous action spaces, IEEE Transactions on Automatic Control, vol.54, issue.6, pp.1243-1253, 2009.

J.-B. Grill, M. Valko, and R. Munos, Black-box optimization of noisy functions with unknown smoothness, Neural Information Processing Systems, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01222915

J.-B. Grill, M. Valko, and R. Munos, Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning, Neural Information Processing Systems, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01389107

K. Jamieson and A. Talwalkar, Non-stochastic best arm identification and hyperparameter optimization, International Conference on Artificial Intelligence and Statistics, 2016.

R. Kleinberg, A. Slivkins, and E. Upfal, Multi-armed bandit problems in metric spaces, Symposium on Theory of Computing, 2008.

R. Kleinberg, A. Slivkins, and E. Upfal, Bandits and experts in metric spaces, Journal of the ACM, 2015.

R. D. Kleinberg, Nearly tight bounds for the continuum-armed bandit problem, Neural Information Processing Systems, 2005.

L. Kocsis and C. Szepesvári, Bandit-based Monte-Carlo planning, European Conference on Machine Learning, 2006.

O. V. Lepski and V. G. Spokoiny, Optimal pointwise adaptive methods in nonparametric estimation, The Annals of Statistics, vol.25, issue.6, pp.2512-2546, 1997.

O. V. Lepski, Asymptotically minimax adaptive estimation. I: Upper bounds. Optimally adaptive estimates, Theory of Probability & Its Applications, vol.36, pp.682-697, 1992.

L. Li, K. Jamieson, G. DeSalvo, A. Rostamizadeh, and A. Talwalkar, Hyperband: Bandit-based configuration evaluation for hyperparameter optimization, International Conference on Learning Representations, 2017.

A. Locatelli and A. Carpentier, Adaptivity to Smoothness in X-armed bandits, Conference on Learning Theory, 2018.