Abstract: We consider the problem of learning the optimal policy of an unknown Markov decision process (MDP) when expert demonstrations are available along with interaction samples. We build on classification-based policy iteration to perform a seamless integration of interaction and expert data, thus obtaining an algorithm that can benefit from both sources of information at the same time. Furthermore, we provide a full theoretical analysis of the performance across iterations, providing insight into how the algorithm works. Finally, we report an empirical evaluation of the algorithm and a comparison with state-of-the-art algorithms.
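The abstract describes folding expert demonstrations into classification-based (direct) policy iteration: at each iteration, greedy action labels estimated from rollouts and demonstrated (state, action) pairs are merged into one training set for the next policy. The sketch below illustrates that idea on a toy chain MDP; the environment, the tabular majority-vote "classifier", and the doubled weight on expert labels are illustrative assumptions, not the paper's actual algorithm or analysis.

```python
# Illustrative sketch of classification-based policy iteration with
# demonstrations on a toy 6-state chain MDP (reward for reaching the
# rightmost state). All design choices here are assumptions for the demo.
import random

N_STATES, ACTIONS, GAMMA, HORIZON = 6, (-1, +1), 0.9, 20

def step(s, a):
    """Deterministic chain: move left/right, reward 1 at the right end."""
    s2 = max(0, min(N_STATES - 1, s + a))
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

def rollout_return(s, a, policy):
    """Estimate Q(s, a): take action a, then follow the current policy."""
    s, total = step(s, a)[0], step(s, a)[1]
    disc = GAMMA
    for _ in range(HORIZON):
        s, r = step(s, policy[s])
        total += disc * r
        disc *= GAMMA
    return total

def dpi_with_demos(expert_pairs, n_iters=5, n_rollouts=5, seed=0):
    random.seed(seed)
    policy = [random.choice(ACTIONS) for _ in range(N_STATES)]
    for _ in range(n_iters):
        dataset = []
        # Interaction data: greedy labels from rollout Q-estimates
        # (averaging matters for stochastic MDPs; this chain is deterministic).
        for s in range(N_STATES):
            q = {a: sum(rollout_return(s, a, policy)
                        for _ in range(n_rollouts)) / n_rollouts
                 for a in ACTIONS}
            dataset.append((s, max(q, key=q.get)))
        # Expert data: demonstrated (state, action) pairs join the same
        # training set; doubling them is an illustrative weighting choice.
        dataset += expert_pairs * 2
        # "Classifier": tabular majority vote per state.
        new_policy = list(policy)
        for s in range(N_STATES):
            labels = [a for (x, a) in dataset if x == s]
            if labels:
                new_policy[s] = max(set(labels), key=labels.count)
        policy = new_policy
    return policy

# Expert demonstrates the optimal action (move right) in the leftmost states,
# where random interaction rarely sees reward.
print(dpi_with_demos(expert_pairs=[(0, +1), (1, +1)]))
```

Because the demonstrated labels enter the same classification problem as the rollout-derived labels, states covered by the expert converge immediately while the remaining states are corrected by interaction data over the iterations.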
https://hal.inria.fr/hal-01237659
Contributor: Alessandro Lazaric
Submitted on: Thursday, December 3, 2015 - 3:46:20 PM
Last modification on: Friday, December 11, 2020 - 6:44:05 PM
Long-term archiving on: Saturday, April 29, 2017 - 7:49:10 AM
Jessica Chemali, Alessandro Lazaric. Direct Policy Iteration with Demonstrations. IJCAI - 24th International Joint Conference on Artificial Intelligence, Jul 2015, Buenos Aires, Argentina. ⟨hal-01237659⟩