Dynamic Programming, 1957. ,
Neuro-dynamic Programming, Athena Scientific, 1996. ,
Online optimization in x-armed bandits, Advances in Neural Information Processing Systems 22, 2008. ,
URL : https://hal.archives-ouvertes.fr/inria-00329797
Progressive Strategies for Monte-Carlo Tree Search, Proceedings of the 10th Joint Conference on Information Sciences, pp.655-661, 2007. ,
Continuous Upper Confidence Trees, LION'11: Proceedings of the 5th International Conference on Learning and Intelligent OptimizatioN, p.page TBA, 2011. ,
DOI : 10.1016/0196-8858(85)90002-8
URL : https://hal.archives-ouvertes.fr/hal-00835352
Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search ,
DOI : 10.1007/978-3-540-75538-8_7
URL : https://hal.archives-ouvertes.fr/inria-00116992
Computing elo ratings of move patterns in the game of go, Computer Games Workshop, 2007. ,
URL : https://hal.archives-ouvertes.fr/inria-00149859
Combining online and offline knowledge in UCT, Proceedings of the 24th international conference on Machine learning, ICML '07, pp.273-280, 2007. ,
DOI : 10.1145/1273496.1273531
URL : https://hal.archives-ouvertes.fr/inria-00164003