Conference paper, Year: 2012

Semi-Supervised Apprenticeship Learning

Michal Valko
Mohammad Ghavamzadeh
Alessandro Lazaric

Abstract

In apprenticeship learning we aim to learn a good policy by observing the behavior of an expert or a set of experts. In particular, we consider the case where the expert acts so as to maximize an unknown reward function defined as a linear combination of a set of state features. In this paper, we consider the setting where we observe many sample trajectories (i.e., sequences of states) but only one or a few of them are labeled as expert trajectories. We investigate the conditions under which the remaining unlabeled trajectories can help in learning a policy with good performance. In particular, we define an extension of the max-margin inverse reinforcement learning algorithm proposed by Abbeel and Ng (2004) where, at each iteration, the max-margin optimization step is replaced by a semi-supervised optimization problem that favors classifiers separating clusters of trajectories. Finally, we report empirical results on two grid-world domains showing that the semi-supervised algorithm outputs a better policy in fewer iterations than the related algorithm that does not take the unlabeled trajectories into account.
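
To make the linear-reward setting in the abstract concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it only illustrates discounted feature expectations of a trajectory and the supervised max-margin step in the special case of one labeled expert trajectory and one candidate policy, where the unit-norm maximizer is the normalized difference of the two feature expectations. The semi-supervised step that uses unlabeled trajectories is not reproduced. The grid cells, the two features in `phi`, the discount `gamma`, and all function names are assumptions made up for illustration.

```python
import math

# Hypothetical 3x3 grid world: states are (x, y) cells.
GOAL = (2, 0)       # assumed goal cell
HAZARD = (0, 1)     # assumed hazard cell

def phi(state):
    """Assumed state features: [is_goal, is_hazard]."""
    return [1.0 if state == GOAL else 0.0,
            1.0 if state == HAZARD else 0.0]

def feature_expectation(trajectory, gamma):
    """Discounted sum of state features along one trajectory."""
    mu = [0.0] * len(phi(trajectory[0]))
    for t, state in enumerate(trajectory):
        for i, value in enumerate(phi(state)):
            mu[i] += (gamma ** t) * value
    return mu

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def max_margin_single(mu_expert, mu_policy):
    """Maximize w.(mu_expert - mu_policy) subject to ||w||_2 <= 1.

    With a single constraint the maximizer is the normalized
    difference, and the margin is the norm of that difference.
    """
    diff = [e - p for e, p in zip(mu_expert, mu_policy)]
    margin = math.sqrt(dot(diff, diff))
    if margin == 0.0:
        return [0.0] * len(diff), 0.0
    return [d / margin for d in diff], margin

# One labeled expert trajectory and one candidate-policy trajectory.
expert_traj = [(0, 0), (1, 0), (2, 0)]      # reaches the goal
candidate_traj = [(0, 0), (0, 1), (0, 2)]   # crosses the hazard

mu_expert = feature_expectation(expert_traj, gamma=0.5)
mu_candidate = feature_expectation(candidate_traj, gamma=0.5)
w, margin = max_margin_single(mu_expert, mu_candidate)

# Under reward R(s) = w . phi(s), the discounted return of a
# trajectory is w . mu, so the expert scores higher by `margin`.
print(w, margin)
```

In this toy case the learned weights reward the goal feature and penalize the hazard feature. With several candidate policies the max-margin step becomes a quadratic program rather than a closed form, and that is the step the paper replaces with a semi-supervised problem.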
Main file
valko2012semi-supervised_published.pdf (287.5 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-00747921, version 1 (02-11-2012)
hal-00747921, version 2 (16-01-2013)

Identifiers

  • HAL Id: hal-00747921, version 2

Cite

Michal Valko, Mohammad Ghavamzadeh, Alessandro Lazaric. Semi-Supervised Apprenticeship Learning. The 10th European Workshop on Reinforcement Learning (EWRL 2012), Jun 2012, Edinburgh, United Kingdom. pp.131-141. ⟨hal-00747921v2⟩