Continuous Rapid Action Value Estimates

Adrien Couetoux; Mario Milone; Matyas Brendel; Hassen Doghmen; Michèle Sebag; Olivier Teytaud

Conference paper, 2011


Authors and affiliations

Adrien Couetoux: Laboratoire de Recherche en Informatique
Mario Milone: Laboratoire de Recherche en Informatique
Matyas Brendel: Machine Learning and Optimisation
Hassen Doghmen: Machine Learning and Optimisation
Michèle Sebag: Machine Learning and Optimisation; Laboratoire de Recherche en Informatique
Olivier Teytaud: Laboratoire de Recherche en Informatique; Machine Learning and Optimisation

Abstract

In the last decade, Monte-Carlo Tree Search (MCTS) has revolutionized the domain of large-scale Markov Decision Process problems. MCTS most often uses the Upper Confidence Tree (UCT) algorithm to handle the exploration versus exploitation trade-off, while a few heuristics are used to guide the exploration in large search spaces. Among these heuristics is Rapid Action Value Estimate (RAVE). This paper is concerned with extending the RAVE heuristic to continuous action and state spaces. The approach is experimentally validated on two benchmark problems: the artificial treasure hunt game and a real-world energy management problem.
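As background for the two discrete-case ingredients named in the abstract, the following is a minimal sketch of a UCB1-style selection score (the rule UCT applies at each tree node) and of a RAVE-style blend between a node's own mean value and its all-moves-as-first estimate. It does not reproduce the paper's continuous extension. The function names, the exploration constant `c`, and the equivalence parameter `k` are illustrative assumptions, and the weighting schedule shown is one commonly used in the discrete setting, not necessarily the one used by the authors.

```python
import math


def ucb1_score(mean_value, visits, parent_visits, c=math.sqrt(2)):
    """UCB1-style score: empirical mean plus an exploration bonus.

    Unvisited actions get an infinite score so that each is tried once.
    Assumes parent_visits >= 1 whenever visits > 0.
    """
    if visits == 0:
        return math.inf
    return mean_value + c * math.sqrt(math.log(parent_visits) / visits)


def rave_weight(visits, k=1000.0):
    """Weight given to the RAVE estimate (assumed schedule).

    Equals 1 with no visits and decays toward 0 as visits grow,
    so the node's own statistics eventually dominate.
    """
    return math.sqrt(k / (3.0 * visits + k))


def blended_value(mean_value, visits, rave_value, k=1000.0):
    """Convex combination of the node's own mean and its RAVE estimate."""
    beta = rave_weight(visits, k)
    return (1.0 - beta) * mean_value + beta * rave_value
```

In this sketch, an action with few visits is valued mostly by its RAVE estimate, which pools results from every simulation in which the action was played, while a well-visited action is valued mostly by its own mean.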

Keywords

Rapid Action-Value Estimates; continuous domains; reinforcement learning

Domains

Optimization and Control [math.OC]

Main file

couetoux.pdf (187.05 KB)

Origin : Files produced by the author(s)


https://inria.hal.science/hal-00642459

Submitted on: Wednesday, November 23, 2011, 3:31:28 AM

Last modification on: Monday, April 15, 2024, 6:04:11 PM

Long-term archiving on: Friday, November 16, 2012, 11:51:15 AM

Dates and versions

hal-00642459, version 1 (23-11-2011)

Identifiers

HAL Id: hal-00642459, version 1

Cite

Adrien Couetoux, Mario Milone, Matyas Brendel, Hassen Doghmen, Michèle Sebag, et al. Continuous Rapid Action Value Estimates. The 3rd Asian Conference on Machine Learning (ACML2011), Nov 2011, Taoyuan, Taiwan. pp.19-31. ⟨hal-00642459⟩


Collections

EC-PARIS CNRS INRIA UMR8623 INRIA2 LRI-AO TDS-MACS UNIV-PARIS-SACLAY
