Maximum Entropy Semi-Supervised Inverse Reinforcement Learning

Julien Audiffren (1), Michal Valko (2), Alessandro Lazaric (2), Mohammad Ghavamzadeh (2)
(2) SEQUEL (Sequential Learning), Inria Lille - Nord Europe, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille (UMR 9189)
Abstract: A popular approach to apprenticeship learning (AL) is to formulate it as an inverse reinforcement learning (IRL) problem. The MaxEnt-IRL algorithm successfully integrates the maximum entropy principle into IRL and, unlike its predecessors, resolves the ambiguity arising from the fact that a possibly large number of policies could match the expert's behavior. In this paper, we study an AL setting in which, in addition to the expert's trajectories, a number of unsupervised trajectories are available. We introduce MESSI, a novel algorithm that combines MaxEnt-IRL with principles coming from semi-supervised learning. In particular, MESSI integrates the unsupervised data into the MaxEnt-IRL framework using a pairwise penalty on trajectories. Empirical results in highway driving and grid-world problems indicate that MESSI is able to take advantage of the unsupervised trajectories and improve the performance of MaxEnt-IRL.
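As a rough illustration of the idea described in the abstract (not the authors' exact formulation), the sketch below shows one gradient step for a MaxEnt-IRL objective augmented with a graph-Laplacian-style pairwise penalty over trajectory scores. All names (all_feats, sim, model_feat_expectation, lam, lr) are hypothetical placeholders, and the penalty form is assumed for illustration only.

    import numpy as np

    def semi_supervised_maxent_irl_step(theta, expert_feats, model_feat_expectation,
                                        all_feats, sim, lam=0.1, lr=0.01):
        """One gradient-ascent step on a MaxEnt-IRL log-likelihood augmented with a
        pairwise smoothness penalty over trajectory scores (illustrative sketch only).

        theta: (d,) reward parameters; expert_feats: (n_expert, d) expert trajectory features;
        model_feat_expectation: (d,) feature expectation under the current reward (precomputed);
        all_feats: (n, d) features of expert + unsupervised trajectories;
        sim: (n, n) symmetric trajectory-similarity matrix.
        """
        # Standard MaxEnt-IRL gradient: empirical expert feature counts minus the
        # feature expectation induced by the current reward parameters.
        grad_ll = expert_feats.mean(axis=0) - model_feat_expectation

        # Pairwise penalty over all trajectories (expert + unsupervised):
        # sum_{i<j} sim[i, j] * (theta.f_i - theta.f_j)^2 = s^T L s, with L the graph Laplacian.
        scores = all_feats @ theta
        laplacian = np.diag(sim.sum(axis=1)) - sim
        grad_pen = 2.0 * all_feats.T @ (laplacian @ scores)

        # Ascend the penalized objective: similar trajectories are pushed toward similar scores.
        return theta + lr * (grad_ll - lam * grad_pen)

This only conveys the general mechanism of coupling unsupervised trajectories to the supervised ones through a pairwise term; the precise penalty and optimization used by MESSI are defined in the paper.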

https://hal.inria.fr/hal-01146187
Contributor: Alessandro Lazaric
Submitted on: Monday, July 20, 2015 - 10:10:21 AM
Last modification on: Friday, March 22, 2019 - 1:35:14 AM
Long-term archiving on: Wednesday, October 21, 2015 - 5:00:43 PM

File

messi-TR.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01146187, version 1

Citation

Julien Audiffren, Michal Valko, Alessandro Lazaric, Mohammad Ghavamzadeh. Maximum Entropy Semi-Supervised Inverse Reinforcement Learning. International Joint Conference on Artificial Intelligence, Jul 2015, Buenos Aires, Argentina. ⟨hal-01146187⟩
