
Improving the exploration in Upper Confidence Trees

Adrien Couetoux 1 Hassen Doghmen 2 Olivier Teytaud 1, 2, 3 
2 TAO - Machine Learning and Optimisation
LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France, CNRS - Centre National de la Recherche Scientifique : UMR8623
Abstract: In the standard version of the UCT algorithm, when the set of decisions is continuous, new decisions are explored through blind search. This can make exploration very inefficient, particularly in high-dimensional problems, which are common in energy management, for instance. To use the information gathered through past simulations to better explore new decisions, we propose a method named Blind Value (BV). It requires only access to a function that randomly draws feasible decisions. We implement it and compare it to the original version of continuous UCT. Our results show a significant increase in convergence speed in dimensions 12 and 80.
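To make "blind search" concrete, the following is a minimal sketch of decision selection at a single node of a continuous UCT, not the authors' implementation. It assumes a progressive-widening rule (a new decision is added when floor(visits^alpha) exceeds the number of decisions tried so far) and a standard UCB score. The names `Node`, `select_decision`, `update`, `draw_feasible`, `alpha` and `c` are hypothetical. The Blind Value scoring rule is not given in the abstract and is not reproduced here; the comment only marks where a BV-style choice would replace the blind draw.

```python
import math
import random


class Node:
    """Statistics kept at one decision node (hypothetical structure)."""

    def __init__(self):
        self.visits = 0
        self.decisions = []  # decisions tried so far
        self.counts = []     # number of simulations per decision
        self.totals = []     # sum of rewards per decision


def select_decision(node, draw_feasible, alpha=0.5, c=1.0):
    """Return the index of the decision to simulate next.

    Assumption: progressive widening with exponent `alpha` decides when a
    new decision is added; `draw_feasible` is the only access we have to
    the decision set, as in the abstract.
    """
    node.visits += 1
    if int(node.visits ** alpha) > len(node.decisions):
        # Blind search: the new decision ignores all past statistics.
        # A BV-style method would instead draw several candidates and keep
        # the one favoured by information from earlier simulations.
        node.decisions.append(draw_feasible())
        node.counts.append(0)
        node.totals.append(0.0)
        return len(node.decisions) - 1

    def ucb(i):
        if node.counts[i] == 0:
            return math.inf
        mean = node.totals[i] / node.counts[i]
        return mean + c * math.sqrt(math.log(node.visits) / node.counts[i])

    return max(range(len(node.decisions)), key=ucb)


def update(node, index, reward):
    """Record the reward observed after simulating decision `index`."""
    node.counts[index] += 1
    node.totals[index] += reward
```

With a toy sampler such as `lambda: [random.uniform(-1, 1) for _ in range(12)]`, every newly added decision is an independent uniform draw in dimension 12. The abstract identifies this step as the source of inefficiency when the dimension is large.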
Document type: Conference papers

Cited literature: 8 references
Contributor: Adrien Couetoux
Submitted on: Thursday, October 25, 2012 - 6:46:39 AM
Last modification on: Sunday, June 26, 2022 - 11:57:17 AM
Long-term archiving on: Saturday, January 26, 2013 - 3:37:46 AM


Files produced by the author(s)


  • HAL Id: hal-00745208, version 1



Adrien Couetoux, Hassen Doghmen, Olivier Teytaud. Improving the exploration in Upper Confidence Trees. Learning and Intelligent OptimizatioN Conference LION 6, Jan 2012, Paris, France. ⟨hal-00745208⟩


