Skip to Main content Skip to Navigation
Conference papers

Continuous Upper Con dence Trees

Adrien Couetoux 1 Jean-Baptiste Hoock 1, 2 Nataliya Sokolovska 1, 2 Olivier Teytaud 1, 2, 3 Nicolas Bonnard 4
2 TAO - Machine Learning and Optimisation
CNRS - Centre National de la Recherche Scientifique : UMR8623, Inria Saclay - Ile de France, UP11 - Université Paris-Sud - Paris 11, LRI - Laboratoire de Recherche en Informatique
Abstract : Upper Con dence Trees are a very e cient tool for solving Markov Decision Processes; originating in di cult games like the game of Go, it is in particular surprisingly e cient in high dimensional problems. It is known that it can be adapted to continuous domains in some cases (in particular continuous action spaces). We here present an extension of Upper Con dence Trees to continuous stochastic problems. We (i) show a deceptive problem on which the classical Upper Con dence Tree approach does not work, even with arbitrarily large computational power and with progressive widening (ii) propose an improvement, termed double-progressive widening, which takes care of the compromise between variance (we want in nitely many simulations for each action/state) and bias (we want su ciently many nodes to avoid a bias by the rst nodes) and which extends the classical progressive widening (iii) discuss its consistency and show experimentally that it performs well on the deceptive problem and on experimental benchmarks. We guess that the double-progressive widening trick can be used for other algorithms as well, as a general tool for ensuring a good bias/variance compromise in search algorithms.
Document type :
Conference papers
Complete list of metadata

Cited literature [14 references]  Display  Hide  Download

https://hal.inria.fr/hal-00745206
Contributor : Adrien Couetoux <>
Submitted on : Thursday, October 25, 2012 - 6:32:59 AM
Last modification on : Thursday, July 8, 2021 - 3:48:15 AM
Long-term archiving on: : Saturday, January 26, 2013 - 3:00:09 AM

File

c0mcts.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00745206, version 1

Collections

Citation

Adrien Couetoux, Jean-Baptiste Hoock, Nataliya Sokolovska, Olivier Teytaud, Nicolas Bonnard. Continuous Upper Con dence Trees. Learning and Intelligent Optimization: 5th International Conference, LION 5, Jan 2011, Rome, Italy. ⟨hal-00745206⟩

Share

Metrics

Record views

433

Files downloads

331