J.-Y. Audibert, R. Munos, and C. Szepesvári, Use of variance estimation in the multi-armed bandit problem, NIPS 2006 Workshop on On-line Trading of Exploration and Exploitation, 2006.

P. Audouard, G. Chaslot, J.-B. Hoock, J. Perez, A. Rimmel, and O. Teytaud, Grid coevolution for adaptive simulations; application to the building of opening books in the game of Go, Proceedings of EvoGames. CAp, 2009.

P. Auer, N. Cesa-Bianchi, and P. Fischer, Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol.47, issue.2-3, pp.235-256, 2002.

A. Auger and O. Teytaud, Continuous lunches are free plus the design of optimal optimization algorithms, Algorithmica, 2009.

T. Cazenave and N. Jouandeau, On the parallelization of UCT, Proceedings of CGW07, pp.93-101, 2007.

G. Chaslot, J.-T. Saito, B. Bouzy, J. Uiterwijk, and H. J. van den Herik, Monte-Carlo Strategies for Computer Go, Proceedings of the 18th BeNeLux Conference on Artificial Intelligence, pp.83-91, 2006.

G. Chaslot, M. Winands, J. Uiterwijk, H. van den Herik, and B. Bouzy, Progressive strategies for Monte-Carlo tree search, Proceedings of the 10th Joint Conference on Information Sciences, pp.655-661, 2007.

G. Chaslot, M. Winands, and H. van den Herik, Parallel Monte-Carlo Tree Search, Proceedings of the Conference on Computers and Games, 2008.

M. Crasmaru and J. Tromp, Ladders are PSPACE-complete, Computers and Games, pp.241-249, 2000.

A. Desrosières, La politique des grands nombres : histoire de la raison statistique, 2000.

E. Even-Dar, S. Mannor, and Y. Mansour, Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems, Journal of Machine Learning Research, vol.7, pp.1079-1105, 2006.

S. Gelly and D. Silver, Combining online and offline knowledge in UCT, ICML '07: Proceedings of the 24th International Conference on Machine Learning, pp.273-280, 2007.

V. Heidrich-Meisner and C. Igel, Hoeffding and Bernstein races for selecting policies in evolutionary direct policy search, ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning, pp.401-408, 2009.

H. Kato and I. Takeuchi, Parallel Monte-Carlo tree search with simulation servers, 13th Game Programming Workshop, 2008.

L. Kocsis and C. Szepesvári, Bandit-based Monte-Carlo planning, ECML'06, pp.282-293, 2006.

T. Lai and H. Robbins, Asymptotically efficient adaptive allocation rules, Advances in Applied Mathematics, vol.6, pp.4-22, 1985.

C.-S. Lee, M.-H. Wang, G. Chaslot, J.-B. Hoock, A. Rimmel, O. Teytaud et al., The Computational Intelligence of MoGo Revealed in Taiwan's Computer Go Tournaments, IEEE Transactions on Computational Intelligence and AI in Games, pp.73-89, 2009.

D. Lichtenstein and M. Sipser, Go is polynomial-space hard, J. ACM, vol.27, issue.2, pp.393-401, 1980.

V. Mnih, C. Szepesvári, and J.-Y. Audibert, Empirical Bernstein stopping, ICML '08: Proceedings of the 25th International Conference on Machine Learning, pp.672-679, 2008.

P. Rolet, M. Sebag, and O. Teytaud, Optimal active learning through billiards and upper confidence trees in continuous domains, Proceedings of the ECML conference, 2009.

Y. Wang and S. Gelly, Modifications of UCT and sequence-like simulations for Monte-Carlo Go, IEEE Symposium on Computational Intelligence and Games, pp.175-182, 2007.