Primal Wasserstein Imitation Learning

Robert Dadashi; Léonard Hussenot; Matthieu Geist; Olivier Pietquin

Conference paper, Year: 2020

Robert Dadashi

Role: Corresponding author
PersonId: 1092948

Google Research [Paris]

Léonard Hussenot

Role: Author
PersonId: 1092830

Google Research [Paris]

Scool

Matthieu Geist

Role: Author
PersonId: 6945
IdHAL: matthieu-geist

Google Research [Paris]

Olivier Pietquin

Role: Author
PersonId: 4024
IdHAL: olivier-pietquin
ORCID: 0000-0002-5386-465X
IdRef: 142821861

Google Research [Paris]

Abstract

Imitation Learning (IL) methods seek to match the behavior of an agent with that of an expert. In the present work, we propose a new IL method based on a conceptually simple algorithm: Primal Wasserstein Imitation Learning (PWIL), which is tied to the primal form of the Wasserstein distance between the expert and the agent state-action distributions. We present a reward function that is derived offline, as opposed to recent adversarial IL algorithms that learn a reward function through interactions with the environment, and that requires little fine-tuning. We show that we can recover expert behavior on a variety of continuous control tasks of the MuJoCo domain in a sample-efficient manner, in terms of both agent interactions and expert interactions with the environment. Finally, we show that the behavior of the agent we train matches the behavior of the expert as measured by the Wasserstein distance, rather than by the commonly used proxy of performance.
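
To make the quantity named in the abstract concrete: in its primal form, the Wasserstein distance between two distributions is the smallest expected transport cost over all couplings of the two. The sketch below is an illustration only, not the PWIL reward or the authors' code. It computes the exact 1-Wasserstein distance between two small, equal-size sets of points, each treated as a uniform empirical distribution. The function names, the Euclidean cost, and the toy points standing in for state-action pairs are all assumptions made for this example.

```python
import itertools
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length tuples (assumed cost)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def primal_wasserstein_1(agent_pairs, expert_pairs):
    """Exact 1-Wasserstein distance between two uniform empirical
    distributions that have the same number of points.

    For equal-size uniform samples, an optimal coupling can be taken to be
    a one-to-one matching, so this sketch brute-forces every permutation.
    That is only viable for a handful of points.
    """
    n = len(agent_pairs)
    if n == 0 or n != len(expert_pairs):
        raise ValueError("this sketch assumes equal, non-zero sample sizes")
    # Total cost of the cheapest one-to-one matching.
    best_total = min(
        sum(euclidean(agent_pairs[i], expert_pairs[perm[i]]) for i in range(n))
        for perm in itertools.permutations(range(n))
    )
    # Each matched pair carries mass 1/n.
    return best_total / n


# Hypothetical toy data: two points per distribution.
agent = [(0.0, 0.0), (1.0, 0.0)]
expert = [(0.0, 1.0), (1.0, 1.0)]
distance = primal_wasserstein_1(agent, expert)
```

In this toy case each agent point is matched to the expert point directly above it, so the distance is the average of two unit-length moves. Realistic sample sizes would call for a linear assignment or optimal transport solver rather than enumeration.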

Domains

Computer Science [cs]

Main file

2006.04678.pdf (6.94 MB)

Origin: Files produced by the author(s)

Contributor: Léonard Hussenot

https://inria.hal.science/hal-03162526

Submitted on: Monday, March 8, 2021, 16:01:06

Last modified on: Wednesday, January 24, 2024, 09:54:24

Long-term archiving on: Wednesday, June 9, 2021, 19:13:31

Dates and versions

hal-03162526, version 1 (08-03-2021)

Identifiers

HAL Id: hal-03162526, version 1
arXiv: 2006.04678

Cite

Robert Dadashi, Léonard Hussenot, Matthieu Geist, Olivier Pietquin. Primal Wasserstein Imitation Learning. ICLR 2021 - Ninth International Conference on Learning Representations, May 2021, Vienna / Virtual, Austria. ⟨hal-03162526⟩

Collections

CNRS INRIA CRISTAL INRIA2 UNIV-LILLE CRISTAL-SCOOL

90 views

154 downloads
