Improved Rates for the Stochastic Continuum-Armed Bandit Problem, 20th Conference on Learning Theory, pp.454-468, 2007. ,
DOI : 10.1007/978-3-540-72927-3_33
Regret and convergence bounds for immediate-reward reinforcement learning with continuous action spaces, 2004. ,
Bandit algorithms for tree search, Proceedings of 23rd Conference on Uncertainty in Artificial Intelligence, 2007. ,
URL : https://hal.archives-ouvertes.fr/inria-00150207
Stochastic Processes, 1953. ,
Modification of UCT with patterns in Monte-Carlo go, 2006. ,
URL : https://hal.archives-ouvertes.fr/inria-00117266
Nearly tight bounds for the continuum-armed bandit problem, 18th Advances in Neural Information Processing Systems, 2004. ,
Multi-armed bandits in metric spaces, Proceedings of the fourtieth annual ACM symposium on Theory of computing, STOC 08, 2008. ,
DOI : 10.1145/1374376.1374475
Bandit Based Monte-Carlo Planning, Proceedings of the 15th European Conference on Machine Learning, pp.282-293, 2006. ,
DOI : 10.1007/11871842_29
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.102.1296