Compatible Reward Inverse Reinforcement Learning

Alberto Maria Metelli; Matteo Pirotta; Marcello Restelli

Conference paper, Year: 2017

Alberto Maria Metelli

Role: Author
PersonId : 1024223

Department of Electronics, Information, and Bioengineering [Milano]

Matteo Pirotta

Role: Author
PersonId : 1023840

Sequential Learning

Marcello Restelli

Role: Author
PersonId : 960707

Department of Electronics, Information, and Bioengineering [Milano]

Abstract

Inverse Reinforcement Learning (IRL) is an effective approach to recover a reward function that explains the behavior of an expert by observing a set of demonstrations. This paper presents a novel model-free IRL approach that, unlike most existing IRL algorithms, does not require specifying a function space in which to search for the expert's reward function. Leveraging the fact that the policy gradient must be zero for any optimal policy, the algorithm generates a set of basis functions that span the subspace of reward functions that make the policy gradient vanish. Within this subspace, using a second-order criterion, we search for the reward function that most penalizes deviations from the expert's policy. After introducing our approach for finite domains, we extend it to continuous ones. The proposed approach is empirically compared with other IRL methods in the (finite) Taxi domain and in the (continuous) Linear Quadratic Gaussian (LQG) and Car on the Hill environments.
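The idea of a subspace of rewards that make the policy gradient vanish can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes a single-state problem with four actions, a softmax policy with hypothetical action features `phi` and parameters `theta`, and an exactly known policy rather than one estimated from demonstrations. Under those assumptions the gradient of the expected reward is linear in the reward vector, so the rewards with zero gradient form the null space of a matrix, computed here with an SVD. The second-order selection step mentioned in the abstract is not covered.

```python
import numpy as np


def policy(theta, phi):
    """Softmax policy over actions with logits phi @ theta."""
    logits = phi @ theta
    z = np.exp(logits - logits.max())
    return z / z.sum()


def gradient_operator(theta, phi):
    """Matrix G with grad_theta J(theta) = G @ r, where J = sum_a pi(a) r(a)."""
    pi = policy(theta, phi)
    grad_log_pi = phi - pi @ phi          # row a holds grad_theta log pi(a)
    return grad_log_pi.T @ np.diag(pi)    # shape (n_params, n_actions)


def null_space_basis(A, tol=1e-10):
    """Orthonormal basis of {x : A @ x = 0}, taken from the SVD of A."""
    _, s, vt = np.linalg.svd(A, full_matrices=True)
    rank = int(np.sum(s > tol))
    return vt[rank:].T


# Hypothetical toy data: 4 actions, 2 policy parameters.
phi = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]])
theta = np.array([0.3, -0.5])

G = gradient_operator(theta, phi)
basis = null_space_basis(G)   # columns are rewards with zero policy gradient

print(basis.shape)
print(np.abs(G @ basis).max())
```

In this toy case `G` has rank 2, so `basis` has two columns. Constant rewards always lie in the span, because the expected score function is zero; the remaining direction is a non-trivial reward under which the given policy is a stationary point.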

Domains

Machine Learning [stat.ML]

Main file

6800-compatible-reward-inverse-reinforcement-learning.pdf (452.54 KB)

Origin: Files produced by the author(s)

Contributor: Alessandro Lazaric

https://inria.hal.science/hal-01653328

Submitted on: Friday, December 1, 2017, 12:17:50

Last modified on: Wednesday, January 24, 2024, 09:54:23

Dates and versions

hal-01653328, version 1 (01-12-2017)

Identifiers

HAL Id: hal-01653328, version 1

Cite

Alberto Maria Metelli, Matteo Pirotta, Marcello Restelli. Compatible Reward Inverse Reinforcement Learning. The Thirty-first Annual Conference on Neural Information Processing Systems - NIPS 2017, Dec 2017, Long Beach, United States. ⟨hal-01653328⟩

Collections

CNRS INRIA CRISTAL INRIA2 CRISTAL-SEQUEL UNIV-LILLE

209 views

248 downloads
