Exploration in Model-based Reinforcement Learning by Empirically Estimating Learning Progress

Manuel Lopes 1, Tobias Lang 2, Marc Toussaint 2, Pierre-Yves Oudeyer 1
1 Flowers - Flowing Epigenetic Robots and Systems (Inria Bordeaux - Sud-Ouest, U2IS - Unité d'Informatique et d'Ingénierie des Systèmes)
Abstract: Formal exploration approaches in model-based reinforcement learning estimate the accuracy of the currently learned model without considering the empirical prediction error. For example, PAC-MDP approaches such as R-MAX base their model certainty on the amount of collected data, while Bayesian approaches assume a prior over the transition dynamics. We propose extensions to such approaches that drive exploration solely by empirical estimates of the learner's accuracy and learning progress. We provide a "sanity check" theoretical analysis, discussing the behavior of our extensions in the standard stationary finite state-action case. We then present experimental studies demonstrating the robustness of these exploration measures in non-stationary environments and in cases where the original approaches are misled by wrong domain assumptions.
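
The abstract contrasts two ways of quantifying a model's certainty: from the amount of collected data (as in R-MAX) and from empirical estimates of prediction error and learning progress. A minimal Python sketch of that contrast on a tabular MDP follows; the leave-one-out error estimate, the bonus forms, and all names are illustrative assumptions, not the paper's exact formulation.

import numpy as np
from collections import defaultdict

class TabularModel:
    """Maximum-likelihood transition model over a finite state-action space."""
    def __init__(self, n_states):
        self.n_states = n_states
        # counts[(s, a)][s'] = number of observed transitions s, a -> s'
        self.counts = defaultdict(lambda: np.zeros(n_states))

    def update(self, s, a, s_next):
        self.counts[(s, a)][s_next] += 1

def count_based_bonus(model, s, a, beta=1.0):
    # R-MAX-style: certainty grows with the amount of collected data,
    # regardless of how well the learned model actually predicts.
    n = model.counts[(s, a)].sum()
    return beta / np.sqrt(n + 1.0)

def empirical_error(model, s, a):
    # Leave-one-out estimate of the model's prediction error at (s, a):
    # hold out each observed transition in turn and score the model fit
    # on the remaining data. High error -> keep exploring.
    c = model.counts[(s, a)]
    n = c.sum()
    if n < 2:
        return 1.0  # too little data to cross-validate: maximal bonus
    err = 0.0
    for s_next in np.nonzero(c)[0]:
        p_loo = (c[s_next] - 1) / (n - 1)  # model refit without one sample
        err += c[s_next] * (1.0 - p_loo)   # miss probability, sample-weighted
    return err / n

def learning_progress(prev_err, curr_err):
    # Learning progress = decrease of the empirical error over time. Near-zero
    # progress means the model has converged there (or the transition is
    # inherently unlearnable), so the exploration bonus should fade.
    return max(prev_err - curr_err, 0.0)

An exploring agent would substitute empirical_error (or its decrease, learning_progress) for the count-based certainty term in its planner, so that exploration persists where the model still mispredicts. This is the behavior the abstract claims is robust in non-stationary environments, where visit counts alone would wrongly signal certainty.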
Document type: Conference papers

Cited literature: 17 references

https://hal.inria.fr/hal-00755248
Contributor: Manuel Lopes
Submitted on: Tuesday, November 20, 2012 - 5:33:00 PM
Last modification on: Wednesday, July 3, 2019 - 10:48:04 AM
Long-term archiving on: Thursday, February 21, 2013 - 12:30:43 PM

File

nips.pdf (files produced by the author(s))

Identifiers

  • HAL Id: hal-00755248, version 1

Citation

Manuel Lopes, Tobias Lang, Marc Toussaint, Pierre-Yves Oudeyer. Exploration in Model-based Reinforcement Learning by Empirically Estimating Learning Progress. Neural Information Processing Systems (NIPS), Dec 2012, Lake Tahoe, United States. ⟨hal-00755248⟩

Metrics

Record views: 528
File downloads: 579