P. Auer, R. Ortner, and C. Szepesvári, Improved Rates for the Stochastic Continuum-Armed Bandit Problem, Lecture Notes in Computer Science, vol.4539, pp.454-468, 2007.
DOI : 10.1007/978-3-540-72927-3_33

R. Bellman, Dynamic Programming, 1957.

D. Bertsimas, E. Litvinov, X. A. Sun, J. Zhao, and T. Zheng, Adaptive Robust Optimization for the Security Constrained Unit Commitment Problem, IEEE Transactions on Power Systems, vol.28, issue.1, pp.52-63, 2013.
DOI : 10.1109/TPWRS.2012.2205021

A. Bourki, M. Coulm, P. Rolet, O. Teytaud, and P. Vayssière, Parameter Tuning by Simple Regret Algorithms and Multiple Simultaneous Hypothesis Testing, ICINCO2010, p.10, 2010.

S. Bubeck, R. Munos, G. Stoltz, and C. Szepesvári, Online optimization in X-armed bandits, NIPS, pp.201-208, 2008.
URL : https://hal.archives-ouvertes.fr/inria-00329797

A. Couetoux, J. Hoock, N. Sokolovska, O. Teytaud, and N. Bonnard, Continuous Upper Confidence Trees, LION'11: Proceedings of the 5th International Conference on Learning and Intelligent OptimizatioN, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00835352

R. Coulom, Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search, Proceedings of the 5th International Conference on Computers and Games, pp.72-83, 2006.
DOI : 10.1007/978-3-540-75538-8_7
URL : https://hal.archives-ouvertes.fr/inria-00116992

R. Coulom, Computing Elo ratings of move patterns in the game of Go, Computer Games Workshop, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00149859

R. D. Kleinberg, Nearly tight bounds for the continuum-armed bandit problem, NIPS, 2004.

L. Kocsis and C. Szepesvári, Bandit Based Monte-Carlo Planning, 15th European Conference on Machine Learning (ECML), pp.282-293, 2006.
DOI : 10.1007/11871842_29
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.102.1296

C. Lee, M. Wang, G. Chaslot, J. Hoock, A. Rimmel et al., The Computational Intelligence of MoGo Revealed in Taiwan's Computer Go Tournaments, IEEE Transactions on Computational Intelligence and AI in Games, 2009.

O. Madani, S. Hanks, and A. Condon, On the undecidability of probabilistic planning and related stochastic optimization problems, Artificial Intelligence, vol.147, issue.1-2, pp.5-34, 2003.
DOI : 10.1016/S0004-3702(02)00378-8

C. R. Mansley, A. Weinstein, and M. L. Littman, Sample-based planning for continuous action Markov decision processes, ICAPS. AAAI, 2011.

A. Weinstein and M. L. Littman, Bandit-based planning and learning in continuous-action Markov decision processes, ICAPS. AAAI, 2012.