Feature learning for multi-task inverse reinforcement learning

Olivier Mangin 1 Pierre-Yves Oudeyer 1
1 Flowers - Flowing Epigenetic Robots and Systems
Inria Bordeaux - Sud-Ouest, U2IS - Unité d'Informatique et d'Ingénierie des Systèmes
Abstract: In this paper we study the question of lifelong learning of behaviors from human demonstrations by an intelligent system. One approach is to model the observed demonstrations by a stationary policy. Inverse reinforcement learning, on the other hand, searches for a reward function that makes the observed policy close to optimal in the corresponding Markov decision process. This approach provides a model of the task solved by the demonstrator and has been shown to lead to better generalization in unknown contexts. However, both approaches focus on learning a single task from the expert demonstration. In this paper we propose a feature learning approach for inverse reinforcement learning in which several different tasks are demonstrated, but in which each task is modeled as a mixture of several simpler, primitive tasks. We present an algorithm based on alternate gradient descent that simultaneously learns a dictionary of primitive tasks (in the form of reward functions) and their combination into an approximation of the task underlying the observed behavior. We illustrate how this approach enables efficient re-use of knowledge from previous demonstrations. Namely, knowledge about tasks that were previously observed by the learner is used to improve the learning of a new composite behavior, thus achieving transfer of knowledge between tasks.
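The alternation described in the abstract can be illustrated with a deliberately simplified sketch. It is not the paper's algorithm: it assumes the per-task reward vectors are already available as the columns of a matrix `R`, whereas inverse reinforcement learning has to infer them from demonstrations. The function names, the squared-error objective, the learning rate, and the iteration count are all assumptions made for illustration. The sketch only shows the alternating structure: one gradient step on the mixing coefficients with the dictionary held fixed, then one gradient step on the dictionary with the coefficients held fixed.

```python
import numpy as np


def alternate_gradient_descent(R, n_primitives, n_iters=2000, lr=0.01, seed=0):
    """Approximate R (n_features x n_tasks) as D @ H by alternating gradient steps.

    D holds one hypothetical primitive reward vector per column.
    H holds, per task, the coefficients mixing those primitives.
    The objective (assumed here) is 0.5 * ||D @ H - R||^2.
    """
    rng = np.random.default_rng(seed)
    n_features, n_tasks = R.shape
    D = rng.normal(scale=0.1, size=(n_features, n_primitives))
    H = rng.normal(scale=0.1, size=(n_primitives, n_tasks))
    for _ in range(n_iters):
        # Gradient step on the coefficients, dictionary held fixed.
        H -= lr * (D.T @ (D @ H - R))
        # Gradient step on the dictionary, coefficients held fixed.
        D -= lr * ((D @ H - R) @ H.T)
    return D, H


def reconstruction_error(R, D, H):
    """Sum of squared differences between R and its approximation D @ H."""
    return float(np.sum((D @ H - R) ** 2))
```

In this toy setting, calling `alternate_gradient_descent(R, n_primitives=2)` on a matrix built from two underlying primitives returns a dictionary `D` and coefficients `H` whose product approximates `R`. A new task could then be fitted by updating only its column of `H` while `D` is reused, which loosely mirrors the transfer of knowledge between tasks mentioned in the abstract.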
Document type:
Preprint, working paper
2014

https://hal.inria.fr/hal-01098040
Contributor: Olivier Mangin
Submitted on: Thursday, February 14, 2019 - 16:17:58
Last modified on: Tuesday, February 26, 2019 - 10:46:24

File

firl.pdf
Files produced by the author(s)

License

Distributed under a Creative Commons Attribution - NonCommercial - NoDerivatives 4.0 International License

Identifiers

  • HAL Id: hal-01098040, version 1

Citation

Olivier Mangin, Pierre-Yves Oudeyer. Feature learning for multi-task inverse reinforcement learning. 2014. 〈hal-01098040〉
