Continuous Upper Confidence Trees

Adrien Couetoux; Jean-Baptiste Hoock; Nataliya Sokolovska; Olivier Teytaud; Nicolas Bonnard

Conference paper. Year: 2011


Adrien Couetoux

Role: Author
PersonId : 910214

Laboratoire de Recherche en Informatique

Jean-Baptiste Hoock

Role: Author
PersonId : 861616

Laboratoire de Recherche en Informatique

Machine Learning and Optimisation

Nataliya Sokolovska

Role: Author
PersonId : 761672
ORCID : 0000-0001-8841-1725
IdRef : 175193711

Laboratoire de Recherche en Informatique

Machine Learning and Optimisation

Olivier Teytaud

Role: Author
PersonId : 581
IdHAL : olivier-teytaud
IdRef : 05971008X

Laboratoire de Recherche en Informatique

Machine Learning and Optimisation

Department of Electrical Engineering and Computer Science

Nicolas Bonnard

Role: Author

Artelys

Abstract

Upper Confidence Trees are a very efficient tool for solving Markov Decision Processes. Originating in difficult games such as Go, they are in particular surprisingly efficient in high-dimensional problems. It is known that they can be adapted to continuous domains in some cases (in particular, continuous action spaces). We present here an extension of Upper Confidence Trees to continuous stochastic problems. We (i) exhibit a deceptive problem on which the classical Upper Confidence Tree approach does not work, even with arbitrarily large computational power and with progressive widening; (ii) propose an improvement, termed double-progressive widening, which handles the compromise between variance (we want infinitely many simulations for each action/state) and bias (we want sufficiently many nodes to avoid a bias by the first nodes), and which extends the classical progressive widening; (iii) discuss its consistency and show experimentally that it performs well on the deceptive problem and on experimental benchmarks. We conjecture that the double-progressive widening trick can be used for other algorithms as well, as a general tool for ensuring a good bias/variance compromise in search algorithms.
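
The widening idea named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the widening rule, so the sketch assumes the commonly used form ceil(C * n^alpha), where n is the visit count of a node. The names `max_children`, `ChanceNode`, `sample_transition` and the constants `c` and `alpha` are hypothetical. The sketch shows only the widening applied to sampled next states (the second half of a "double" scheme); action selection and value backups are omitted.

```python
import math
import random


def max_children(visits: int, c: float = 1.0, alpha: float = 0.5) -> int:
    """Assumed widening rule: allow at most ceil(c * visits**alpha) children."""
    return math.ceil(c * visits ** alpha)


class ChanceNode:
    """Hypothetical node for a (state, action) pair with random outcomes."""

    def __init__(self) -> None:
        self.visits = 0
        self.children = []  # next states sampled so far
        self.counts = []    # how often each stored next state was used

    def next_state(self, sample_transition, rng: random.Random):
        """Return a next state, widening only when the rule allows it."""
        self.visits += 1
        if len(self.children) < max_children(self.visits):
            # Widen: draw a fresh outcome from the (assumed) simulator.
            state = sample_transition(rng)
            self.children.append(state)
            self.counts.append(1)
            return state
        # Otherwise reuse a stored outcome, proportionally to past use,
        # so that each stored child keeps receiving simulations.
        index = rng.choices(range(len(self.children)), weights=self.counts)[0]
        self.counts[index] += 1
        return self.children[index]


if __name__ == "__main__":
    rng = random.Random(0)
    node = ChanceNode()
    for _ in range(100):
        node.next_state(lambda r: r.gauss(0.0, 1.0), rng)
    print(len(node.children), "distinct next states after", node.visits, "visits")
```

Under this assumed rule, a small `alpha` keeps few children, each receiving many simulations (lower variance), while a larger `alpha` creates more nodes (less bias from the first sampled nodes); this is the compromise the abstract refers to.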

Domains

Artificial Intelligence [cs.AI]

Main file

c0mcts.pdf (174.69 KB)

Origin: Files produced by the author(s)


https://inria.hal.science/hal-00745206

Submitted on: Thursday, 25 October 2012, 06:32:59

Last modified on: Monday, 12 February 2024, 09:48:04

Long-term archiving on: Saturday, 26 January 2013, 03:00:09

Dates and versions

hal-00745206, version 1 (25-10-2012)

Identifiers

HAL Id: hal-00745206, version 1

Cite

Adrien Couetoux, Jean-Baptiste Hoock, Nataliya Sokolovska, Olivier Teytaud, Nicolas Bonnard. Continuous Upper Confidence Trees. Learning and Intelligent Optimization: 5th International Conference, LION 5, Jan 2011, Rome, Italy. ⟨hal-00745206⟩


Collections

EC-PARIS CNRS INRIA UMR8623 INRIA2 LRI-AO UNIV-PARIS-SACLAY

