P. Auer, N. Cesa-Bianchi, and P. Fischer, Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol.47, issue.2/3, pp.235-256, 2002.
DOI : 10.1023/A:1013689704352

R. Bellman, Dynamic Programming, 1957.

G. Chaslot, J. Hoock, F. Teytaud, and O. Teytaud, On the huge benefit of quasi-random mutations for multimodal optimization with application to grid-based tuning of neurocontrollers, ESANN, 2009.
URL : https://hal.archives-ouvertes.fr/inria-00380125

G. Chaslot, J. Saito, B. Bouzy, J. W. Uiterwijk, and H. J. van den Herik, Monte-Carlo Strategies for Computer Go, Proceedings of the 18th BeNeLux Conference on Artificial Intelligence, pp.83-91, 2006.

G. Chaslot, M. Winands, J. Uiterwijk, H. van den Herik, and B. Bouzy, Progressive strategies for Monte-Carlo tree search, Proceedings of the 10th Joint Conference on Information Sciences, pp.655-661, 2007.

R. Coulom, Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search, Proceedings of the 5th International Conference on Computers and Games, 2006.
DOI : 10.1007/978-3-540-75538-8_7
URL : https://hal.archives-ouvertes.fr/inria-00116992

F. de Mesmay, A. Rimmel, Y. Voronenko, and M. Püschel, Bandit-based optimization on graphs with application to library performance tuning, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, 2009.
DOI : 10.1145/1553374.1553468
URL : https://hal.archives-ouvertes.fr/inria-00379523

S. Gelly and D. Silver, Combining online and offline knowledge in UCT, Proceedings of the 24th international conference on Machine learning, ICML '07, pp.273-280, 2007.
DOI : 10.1145/1273496.1273531
URL : https://hal.archives-ouvertes.fr/inria-00164003

L. Kocsis and C. Szepesvári, Bandit Based Monte-Carlo Planning, ECML'06, pp.282-293, 2006.
DOI : 10.1007/11871842_29

T. Lai and H. Robbins, Asymptotically efficient adaptive allocation rules, Advances in Applied Mathematics, vol.6, issue.1, pp.4-22, 1985.
DOI : 10.1016/0196-8858(85)90002-8

S. Tsai, T. Hsu, and . Hong, The computational intelligence of MoGo revealed in Taiwan's computer Go tournaments, IEEE Transactions on Computational Intelligence and AI in Games, 2009.

P. Rolet, M. Sebag, and O. Teytaud, Optimal active learning through billiards and upper confidence trees in continuous domains, Proceedings of the ECML conference, 2009.

P. Rolet, M. Sebag, and O. Teytaud, Optimal robust expensive optimization is tractable, Proceedings of the 11th Annual conference on Genetic and evolutionary computation, GECCO '09, 2009.
DOI : 10.1145/1569901.1570255
URL : https://hal.archives-ouvertes.fr/inria-00374910

F. Teytaud and O. Teytaud, Creating an Upper-Confidence-Tree Program for Havannah, ACG 12, 2009.
DOI : 10.1007/978-3-642-12993-3_7
URL : https://hal.archives-ouvertes.fr/inria-00380539

Y. Wang, J. Audibert, and R. Munos, Algorithms for infinitely many-armed bandits, Advances in Neural Information Processing Systems, 2008.