Exploration in Model-based Reinforcement Learning by Empirically Estimating Learning Progress

Manuel Lopes 1, Tobias Lang 2, Marc Toussaint 2, Pierre-Yves Oudeyer 1
1 Flowers - Flowing Epigenetic Robots and Systems (Inria Bordeaux - Sud-Ouest, U2IS - Unité d'Informatique et d'Ingénierie des Systèmes)
Abstract: Formal exploration approaches in model-based reinforcement learning estimate the accuracy of the currently learned model without considering the empirical prediction error. For example, PAC-MDP approaches such as R-MAX base their model certainty on the amount of collected data, while Bayesian approaches assume a prior over the transition dynamics. We propose extensions to such approaches that drive exploration solely by empirical estimates of the learner's accuracy and learning progress. We provide a "sanity check" theoretical analysis, discussing the behavior of our extensions in the standard stationary finite state-action case. We then present experimental studies demonstrating the robustness of these exploration measures in non-stationary environments and in cases where the original approaches are misled by wrong domain assumptions.
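
The abstract contrasts two ways of quantifying a model's certainty: from the amount of collected data (as in R-MAX) and from empirical estimates of prediction error and learning progress. A minimal Python sketch of that contrast on a tabular MDP follows; the leave-one-out error estimate, the bonus forms, and all names are illustrative assumptions, not the paper's exact formulation.

import numpy as np
from collections import defaultdict

class TabularModel:
    """Maximum-likelihood transition model over a finite state-action space."""
    def __init__(self, n_states):
        self.n_states = n_states
        # counts[(s, a)][s'] = number of observed transitions s, a -> s'
        self.counts = defaultdict(lambda: np.zeros(n_states))

    def update(self, s, a, s_next):
        self.counts[(s, a)][s_next] += 1

def count_based_bonus(model, s, a, beta=1.0):
    # R-MAX-style: certainty grows with the amount of collected data,
    # regardless of how well the learned model actually predicts.
    n = model.counts[(s, a)].sum()
    return beta / np.sqrt(n + 1.0)

def empirical_error(model, s, a):
    # Leave-one-out estimate of the model's prediction error at (s, a):
    # hold out each observed transition in turn and score the model fit
    # on the remaining data. High error -> keep exploring.
    c = model.counts[(s, a)]
    n = c.sum()
    if n < 2:
        return 1.0  # too little data to cross-validate: maximal bonus
    err = 0.0
    for s_next in np.nonzero(c)[0]:
        p_loo = (c[s_next] - 1) / (n - 1)  # model refit without one sample
        err += c[s_next] * (1.0 - p_loo)   # miss probability, sample-weighted
    return err / n

def learning_progress(prev_err, curr_err):
    # Learning progress = decrease of the empirical error over time. Near-zero
    # progress means the model has converged there (or the transition is
    # inherently unlearnable), so the exploration bonus should fade.
    return max(prev_err - curr_err, 0.0)

An exploring agent would substitute empirical_error (or its decrease, learning_progress) for the count-based certainty term in its planner, so that exploration persists where the model still mispredicts. This is the behavior the abstract claims is robust in non-stationary environments, where visit counts alone would wrongly signal certainty.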
Document type: Conference papers

Cited literature: 17 references

https://hal.inria.fr/hal-00755248
Contributor: Manuel Lopes
Submitted on: Tuesday, November 20, 2012 - 5:33:00 PM
Last modification on: Wednesday, July 3, 2019 - 10:48:04 AM
Long-term archiving on: Thursday, February 21, 2013 - 12:30:43 PM

File

nips.pdf (files produced by the author(s))

Identifiers

  • HAL Id: hal-00755248, version 1

Citation

Manuel Lopes, Tobias Lang, Marc Toussaint, Pierre-Yves Oudeyer. Exploration in Model-based Reinforcement Learning by Empirically Estimating Learning Progress. Neural Information Processing Systems (NIPS), Dec 2012, Lake Tahoe, United States. ⟨hal-00755248⟩

Metrics

Record views: 528
File downloads: 579