Score-based Inverse Reinforcement Learning

Abstract: This paper reports theoretical and empirical results for the score-based Inverse Reinforcement Learning (IRL) algorithm. It relies on a non-standard IRL setting in which a reward is learned from a set of globally scored trajectories. This allows any type of policy (optimal or not) to be used to generate trajectories, without prior knowledge during data collection. As a result, any existing database (such as logs of systems in use) can be scored a posteriori by an expert and used to learn a reward function. It is shown that a near-optimal policy can be computed from this reward function. Being related to least-squares regression, the algorithm (called SBIRL) comes with theoretical guarantees that are proven in this paper. SBIRL is compared to standard IRL algorithms on synthetic data, showing that annotations do help under conditions on the quality of the trajectories. It is also shown to be suitable for real-world applications such as the optimisation of a spoken dialogue system.
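
The abstract states only that SBIRL learns a reward from globally scored trajectories and is related to least-squares regression. The sketch below illustrates one plausible reading of that idea and is not taken from the paper: it assumes a reward linear in state features, r(s) = theta . phi(s), and assumes that a trajectory's score approximates the discounted sum of rewards along it. Under those assumptions, theta can be estimated by regressing scores on discounted feature sums. All function names, the discount factor and the small ridge term are hypothetical choices made for illustration.

```python
# Illustrative sketch only: assumes a linear reward and that each global score
# approximates the discounted return of its trajectory. Not the authors' code.

def discounted_features(trajectory, gamma):
    """Sum gamma**t * phi(s_t) over a trajectory given as a list of feature vectors."""
    dim = len(trajectory[0])
    total = [0.0] * dim
    for t, phi in enumerate(trajectory):
        for i in range(dim):
            total[i] += (gamma ** t) * phi[i]
    return total


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        partial = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - partial) / m[r][r]
    return x


def fit_reward_weights(trajectories, scores, gamma=0.9, ridge=1e-8):
    """Least-squares fit of theta via the (lightly regularised) normal equations."""
    mu = [discounted_features(tr, gamma) for tr in trajectories]
    dim = len(mu[0])
    gram = [
        [sum(row[i] * row[j] for row in mu) + (ridge if i == j else 0.0)
         for j in range(dim)]
        for i in range(dim)
    ]
    rhs = [sum(row[i] * v for row, v in zip(mu, scores)) for i in range(dim)]
    return solve_linear_system(gram, rhs)


def reward(theta, phi):
    """Learned reward for a state with feature vector phi."""
    return sum(w * x for w, x in zip(theta, phi))


# Toy data: two one-hot state features and a hidden "expert" weight vector.
GAMMA = 0.9
true_theta = [1.0, -0.5]
trajectories = [
    [[1, 0], [0, 1]],
    [[0, 1], [0, 1]],
    [[1, 0], [1, 0], [0, 1]],
]
# Scores are simulated here as noiseless discounted returns under true_theta.
scores = [reward(true_theta, discounted_features(tr, GAMMA)) for tr in trajectories]
theta_hat = fit_reward_weights(trajectories, scores, gamma=GAMMA)
```

With noiseless toy scores the fitted weights should closely match the hidden ones. Scores given by a human expert would be noisy, which is the regime the paper's theoretical guarantees presumably address.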
Document type:
Conference paper
International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2016), May 2016, Singapore, Singapore. 〈http://sis.smu.edu.sg/aamas2016〉

Cited literature: 22 references

https://hal.inria.fr/hal-01406886
Contributor: Olivier Pietquin
Submitted on: Friday, 2 December 2016 - 13:32:58
Last modified on: Tuesday, 3 July 2018 - 11:29:37
Document(s) archived on: Monday, 20 March 2017 - 16:45:35

File

aamas-score-based.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01406886, version 1

Citation

Layla El Asri, Bilal Piot, Matthieu Geist, Romain Laroche, Olivier Pietquin. Score-based Inverse Reinforcement Learning. International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2016), May 2016, Singapore, Singapore. 〈http://sis.smu.edu.sg/aamas2016〉. 〈hal-01406886〉

Metrics

Record views

623

File downloads

193