Exploration in Model-based Reinforcement Learning by Empirically Estimating Learning Progress

Manuel Lopes 1 Tobias Lang 2 Marc Toussaint 2 Pierre-Yves Oudeyer 1
1 Flowers - Flowing Epigenetic Robots and Systems
Inria Bordeaux - Sud-Ouest, U2IS - Unité d'Informatique et d'Ingénierie des Systèmes
Abstract: Formal exploration approaches in model-based reinforcement learning estimate the accuracy of the currently learned model without consideration of the empirical prediction error. For example, PAC-MDP approaches such as R-MAX base their model certainty on the amount of collected data, while Bayesian approaches assume a prior over the transition dynamics. We propose extensions to such approaches which drive exploration solely based on empirical estimates of the learner's accuracy and learning progress. We provide a "sanity check" theoretical analysis, discussing the behavior of our extensions in the standard stationary finite state-action case. We then provide experimental studies demonstrating the robustness of these exploration measures in cases of non-stationary environments or where original approaches are misled by wrong domain assumptions.
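To make the contrast in the abstract concrete, the following is a minimal illustrative sketch, not the estimator defined in the paper. It assumes a small tabular setting and replaces a count-based "known" test (as in R-MAX) with a test on empirical learning progress, taken here as the drop in log-loss between an older and a more recent window of predictions. The class name `ProgressTracker`, the window size, the smoothing constant, and the threshold are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict, deque


class ProgressTracker:
    """Tracks empirical prediction loss per (state, action) pair.

    Hypothetical helper for illustration; not the paper's algorithm.
    """

    def __init__(self, n_states, window=5, smoothing=1.0):
        self.n_states = n_states
        self.window = window          # size of each comparison window (assumed)
        self.smoothing = smoothing    # additive smoothing for the count model
        self.counts = defaultdict(lambda: [0] * n_states)
        # Keep only the last 2 * window losses for each (state, action).
        self.losses = defaultdict(lambda: deque(maxlen=2 * window))

    def predict(self, s, a, s_next):
        """Smoothed empirical probability of s_next after taking a in s."""
        c = self.counts[(s, a)]
        total = sum(c) + self.smoothing * self.n_states
        return (c[s_next] + self.smoothing) / total

    def update(self, s, a, s_next):
        """Score the new transition with the current model, then learn it."""
        loss = -math.log(self.predict(s, a, s_next))
        self.losses[(s, a)].append(loss)
        self.counts[(s, a)][s_next] += 1

    def learning_progress(self, s, a):
        """Mean loss of the older window minus mean loss of the recent one."""
        hist = list(self.losses[(s, a)])
        if len(hist) < 2 * self.window:
            return math.inf  # too little data: treat as maximally promising
        older = sum(hist[:self.window]) / self.window
        recent = sum(hist[self.window:]) / self.window
        return max(0.0, older - recent)

    def is_known(self, s, a, threshold=0.05):
        """A pair counts as known once the model has stopped improving on it."""
        return self.learning_progress(s, a) < threshold
```

In an R-MAX-style planner, a pair for which `is_known` returns `False` would be assigned the optimistic reward, so exploration is directed at pairs where the model is still improving rather than at pairs with few visits. Because the test depends on observed loss, a pair can become unknown again if the dynamics change and the loss history shifts.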
Document type:
Conference paper
Neural Information Processing Systems (NIPS), Dec 2012, Lake Tahoe, United States. 2012

https://hal.inria.fr/hal-00755248
Contributor: Manuel Lopes
Submitted on: Tuesday, 20 November 2012 - 17:33:00
Last modified on: Thursday, 16 November 2017 - 17:12:03
Document(s) archived on: Thursday, 21 February 2013 - 12:30:43

File

nips.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00755248, version 1

Citation

Manuel Lopes, Tobias Lang, Marc Toussaint, Pierre-Yves Oudeyer. Exploration in Model-based Reinforcement Learning by Empirically Estimating Learning Progress. Neural Information Processing Systems (NIPS), Dec 2012, Lake Tahoe, United States. 2012. 〈hal-00755248〉

Metrics

Record views

290

File downloads

218