Continuous Rapid Action Value Estimates

Adrien Couetoux 1, Mario Milone 1, Matyas Brendel 2, Hassen Doghmen 2, Michèle Sebag 2,1, Olivier Teytaud 1,2
2 TAO - Machine Learning and Optimisation
LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France, CNRS - Centre National de la Recherche Scientifique : UMR8623
Abstract : In the last decade, Monte-Carlo Tree Search (MCTS) has revolutionized the domain of large-scale Markov Decision Process problems. MCTS most often uses the Upper Confidence Tree (UCT) algorithm to handle the exploration versus exploitation trade-off, while a few heuristics are used to guide the exploration in large search spaces. Among these heuristics is Rapid Action Value Estimate (RAVE). This paper is concerned with extending the RAVE heuristic to continuous action and state spaces. The approach is experimentally validated on two benchmark problems: the artificial treasure hunt game and a real-world energy management problem.
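As a rough illustration of the two ingredients named in the abstract, the sketch below combines a UCB1-style exploration bonus (the selection rule behind Upper Confidence Tree search) with a RAVE estimate for a finite set of actions. It is not the continuous extension proposed in the paper. The function names, the exploration constant `c`, the equivalence parameter `k`, and the blending schedule `sqrt(k / (3n + k))` are assumptions taken from common discrete-action RAVE practice, chosen only to make the idea concrete.

```python
import math


def ucb1_score(mean_reward, visits, parent_visits, c=math.sqrt(2)):
    """Empirical mean plus an exploration bonus that shrinks with visits."""
    if visits == 0:
        return math.inf  # unvisited actions are tried first
    return mean_reward + c * math.sqrt(math.log(parent_visits) / visits)


def rave_blend(q_tree, n_tree, q_rave, k=100.0):
    """Mix the tree estimate with the RAVE estimate.

    The weight beta (an assumed schedule) equals 1 when the action has no
    tree visits and decays towards 0 as n_tree grows, so the RAVE value
    dominates early and the tree value dominates later.
    """
    beta = math.sqrt(k / (3.0 * n_tree + k))
    return (1.0 - beta) * q_tree + beta * q_rave


def select_action(stats, parent_visits):
    """Pick the action with the highest blended score.

    `stats` is a hypothetical mapping: action -> (q_tree, n_tree, q_rave).
    """
    def score(action):
        q_tree, n_tree, q_rave = stats[action]
        blended = rave_blend(q_tree, n_tree, q_rave)
        return ucb1_score(blended, n_tree, parent_visits)

    return max(stats, key=score)
```

With few visits the shared RAVE statistics drive the choice, which is what makes the heuristic useful in large search spaces; how such statistics can be shared when actions and states are continuous is the question the paper addresses.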
Document type : Conference paper

https://hal.inria.fr/hal-00642459
Contributor : Olivier Teytaud
Submitted on : Wednesday, November 23, 2011 - 3:31:28 AM
Last modification on : Thursday, April 5, 2018 - 12:30:12 PM
Long-term archiving on : Friday, November 16, 2012 - 11:51:15 AM

File

couetoux.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00642459, version 1

Citation

Adrien Couetoux, Mario Milone, Matyas Brendel, Hassen Doghmen, Michèle Sebag, et al. Continuous Rapid Action Value Estimates. The 3rd Asian Conference on Machine Learning (ACML2011), Nov 2011, Taoyuan, Taiwan. pp.19-31. ⟨hal-00642459⟩
