P. Auer, N. Cesa-Bianchi, and P. Fischer, Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol.47, issue.2/3, pp.235-256, 2002.
DOI : 10.1023/A:1013689704352

M. Gheshlaghi Azar, A. Lazaric, and E. Brunskill, Online stochastic optimization under correlated bandit feedback, International Conference on Machine Learning, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01080138

S. Bubeck, R. Munos, and G. Stoltz, Pure exploration in finitely-armed and continuous-armed bandits, Theoretical Computer Science, vol.412, issue.19, pp.1832-1852, 2011.
DOI : 10.1016/j.tcs.2010.12.059
URL : https://hal.archives-ouvertes.fr/hal-00609550

S. Bubeck, R. Munos, G. Stoltz, and C. Szepesvári, X-armed bandits, Journal of Machine Learning Research, vol.12, pp.1587-1627, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00450235

S. Bubeck, G. Stoltz, and J. Yu, Lipschitz Bandits without the Lipschitz Constant, Algorithmic Learning Theory, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00595692

A. D. Bull, Adaptive-treed bandits, Bernoulli, vol.21, issue.4, pp.2289-2307, 2015.
DOI : 10.3150/14-BEJ644
URL : http://arxiv.org/pdf/1302.2489

R. Combes and A. Proutiere, Unimodal bandits: Regret lower bounds and optimal algorithms, International Conference on Machine Learning, 2014.
DOI : 10.1145/2745844.2745847
URL : https://hal.archives-ouvertes.fr/hal-01092662

P. Coquelin and R. Munos, Bandit algorithms for tree search, Uncertainty in Artificial Intelligence, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00150207

R. Kleinberg, A. Slivkins, and E. Upfal, Multi-armed bandit problems in metric spaces, Symposium on Theory Of Computing, 2008.
DOI : 10.1145/1374376.1374475
URL : http://www.cs.cornell.edu/~rdk/papers/bandits-lip.pdf

L. Kocsis and C. Szepesvári, Bandit Based Monte-Carlo Planning, European Conference on Machine Learning, 2006.
DOI : 10.1007/11871842_29
URL : https://link.springer.com/content/pdf/10.1007%2F11871842_29.pdf

R. Munos, Optimistic optimization of deterministic functions without the knowledge of its smoothness, Neural Information Processing Systems, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00830143

R. Munos, From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning, Foundations and Trends in Machine Learning, pp.1-130, 2014.
DOI : 10.1561/2200000038
URL : https://hal.archives-ouvertes.fr/hal-00747575

P. Preux, R. Munos, and M. Valko, Bandits attack function optimization, 2014 IEEE Congress on Evolutionary Computation (CEC), 2014.
DOI : 10.1109/CEC.2014.6900558
URL : https://hal.archives-ouvertes.fr/hal-00978637

A. Slivkins, Multi-armed bandits on implicit metric spaces, Neural Information Processing Systems, 2011.

M. Valko, A. Carpentier, and R. Munos, Stochastic simultaneous optimistic optimization, International Conference on Machine Learning, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00789606