Adding Double Progressive Widening to Upper Confidence Trees to Cope with Uncertainty in Planning Problems

Adrien Couetoux 1, Hassen Doghmen 2
2 TAO - Machine Learning and Optimisation
CNRS - Centre National de la Recherche Scientifique : UMR8623, Inria Saclay - Ile de France, UP11 - Université Paris-Sud - Paris 11, LRI - Laboratoire de Recherche en Informatique
Abstract: Current state-of-the-art methods in energy policy planning only approximate the problem (Linear Programming on a finite sample of scenarios, Dynamic Programming on an approximation of the problem, etc.). Monte-Carlo Tree Search (MCTS [3]) is a potential candidate for converging to an exact solution of these problems ([2]). But how fast it converges, and how key parameters (double/simple progressive widening) influence the rate of convergence (or even convergence itself), are still open questions. Also, MCTS completely ignores the features of the problem, including the scale of the objective function. In this paper, we present MCTS and its extension to continuous/stochastic domains. We show that on problems with continuous action spaces and infinite support of random variables, the "vanilla" version of MCTS fails. We also show how the success of the double progressive widening technique [2] relies on its widening coefficient. Finally, we study the impact of an unknown variance of the random variables, to see whether it affects the optimal choice of the widening coefficients.
Document type:
Conference paper
The 9th European Workshop on Reinforcement Learning (EWRL-9), Sep 2011, Athens, Greece. 2011

Cited literature [7 references]

https://hal.inria.fr/hal-00745207
Contributor: Adrien Couetoux
Submitted on: Thursday, October 25, 2012 - 06:42:26
Last modified on: Thursday, January 11, 2018 - 06:22:14
Document(s) archived on: Saturday, January 26, 2013 - 03:05:09

File

ewrl2011_submission_29.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00745207, version 1

Citation

Adrien Couetoux, Hassen Doghmen. Adding Double Progressive Widening to Upper Confidence Trees to Cope with Uncertainty in Planning Problems. The 9th European Workshop on Reinforcement Learning (EWRL-9), Sep 2011, Athens, Greece. 2011. 〈hal-00745207〉

Metrics

Record views: 264

File downloads: 144