Semi-Supervised Apprenticeship Learning

Michal Valko; Mohammad Ghavamzadeh; Alessandro Lazaric

Journal article, Journal of Machine Learning Research. Year: 2012

Michal Valko

Role: Author
PersonId : 284
IdHAL : michal
IdRef : 22360934X

Sequential Learning

Mohammad Ghavamzadeh

Role: Author
PersonId : 868946

Sequential Learning

Alessandro Lazaric

Role: Author
PersonId : 851
IdHAL : alessandro-lazaric
ORCID : 0000-0002-8970-413X
IdRef : 188701486

Sequential Learning

Abstract

In apprenticeship learning we aim to learn a good policy by observing the behavior of an expert or a set of experts. In particular, we consider the case where the expert acts so as to maximize an unknown reward function defined as a linear combination of a set of state features. In this paper, we consider the setting where we observe many sample trajectories (i.e., sequences of states) but only one or a few of them are labeled as experts' trajectories. We investigate the conditions under which the remaining unlabeled trajectories can help in learning a policy with good performance. In particular, we define an extension to the max-margin inverse reinforcement learning proposed by Abbeel and Ng (2004) where, at each iteration, the max-margin optimization step is replaced by a semi-supervised optimization problem which favors classifiers separating clusters of trajectories. Finally, we report empirical results on two grid-world domains showing that the semi-supervised algorithm is able to output a better policy in fewer iterations than the related algorithm that does not take the unlabeled trajectories into account.
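As a rough illustration of the quantities the abstract refers to, the sketch below computes empirical discounted feature expectations from state trajectories and the margin by which a candidate linear reward separates the expert from other policies. It is not the authors' implementation and omits the semi-supervised step entirely. The function names, the one-hot features, the three-state toy trajectories, and the discount factor of 0.5 are all assumptions chosen for illustration.

```python
def feature_expectations(trajectories, phi, gamma):
    """Average over trajectories of sum_t gamma^t * phi(s_t)."""
    dim = len(phi(trajectories[0][0]))
    mu = [0.0] * dim
    for states in trajectories:
        for t, s in enumerate(states):
            f = phi(s)
            for k in range(dim):
                mu[k] += (gamma ** t) * f[k]
    return [m / len(trajectories) for m in mu]


def margin(w, mu_expert, mu_candidates):
    """Smallest gap w.(mu_expert - mu_j) over the candidate policies j."""
    gaps = []
    for mu in mu_candidates:
        gaps.append(sum(wk * (e - m) for wk, e, m in zip(w, mu_expert, mu)))
    return min(gaps)


def one_hot(state, n_states=3):
    """Hypothetical state features: indicator vector of the state index."""
    v = [0.0] * n_states
    v[state] = 1.0
    return v


# Toy data (assumed): the expert walks 0 -> 1 -> 2, another policy stays in 0.
mu_expert = feature_expectations([[0, 1, 2]], one_hot, gamma=0.5)
mu_other = feature_expectations([[0, 0, 0]], one_hot, gamma=0.5)

# Candidate reward weights that reward only state 2.
w = [0.0, 0.0, 1.0]
t = margin(w, mu_expert, [mu_other])
```

In a max-margin scheme of the kind the abstract cites, the weights `w` would be chosen to maximize this margin under a norm constraint; here `w` is simply fixed by hand to keep the example short.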

Domains

Machine Learning [stat.ML]

Main file

paper.pdf (377.53 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-00747921

Submitted on: Friday, 2 November 2012, 19:00:48

Last modified on: Friday, 24 March 2023, 14:52:56

Long-term archiving on: Sunday, 3 February 2013, 03:37:31

Dates and versions

hal-00747921, version 1 (2 November 2012)

hal-00747921, version 2 (16 January 2013)

Identifiers

HAL Id: hal-00747921, version 1

Cite

Michal Valko, Mohammad Ghavamzadeh, Alessandro Lazaric. Semi-Supervised Apprenticeship Learning. Journal of Machine Learning Research, 2012, The 10th European Workshop on Reinforcement Learning, 24. ⟨hal-00747921v1⟩

293 views

184 downloads
