Boosted and Reward-regularized Classification for Apprenticeship Learning

Bilal Piot; Matthieu Geist; Olivier Pietquin

Conference paper. Year: 2014

Bilal Piot

Role: Author
PersonId: 987091

Georgia Tech Lorraine [Metz]

MAchine Learning and Interactive Systems

Matthieu Geist

Role: Author
PersonId: 6945
IdHAL: matthieu-geist

MAchine Learning and Interactive Systems

Olivier Pietquin

Role: Author
PersonId: 4024
IdHAL: olivier-pietquin
ORCID: 0000-0002-5386-465X
IdRef: 142821861

Laboratoire d'Informatique Fondamentale de Lille

Abstract

This paper addresses learning from demonstrations, where an agent called the apprentice tries to learn a behavior from demonstrations given by another agent called the expert. We adopt the Markov Decision Process (MDP) framework, which is well suited to sequential decision-making problems. One way to tackle the problem is to reduce it to classification, but doing so ignores the MDP structure. Other methods do take the MDP structure into account, but they need to solve MDPs, which is a difficult task, and/or require a problem-dependent choice of features. The main contribution of the paper is to extend a large margin approach, which is a classification method, by adding a regularization term that takes the MDP structure into account. The derived algorithm, called Reward-regularized Classification for Apprenticeship Learning (RCAL), does not need to solve MDPs. Its major advantage, however, is that it can be boosted, which avoids the choice of features that is a drawback of parametric approaches. A state-of-the-art experiment (Highway) and generic experiments (structured Garnets) are conducted to compare the performance of RCAL with algorithms from the literature.
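
The abstract does not spell out the objective, so the following is only a minimal sketch of what a "large margin loss plus a regularizer that uses the MDP structure" could look like. It assumes a tabular score function Q, a constant margin for non-expert actions, and an L1 penalty on the reward implied by Q through sampled transitions (Q(s,a) minus the discounted best next score). These choices, and every name below (`large_margin_loss`, `reward_regularizer`, `regularized_loss`, `gamma`, `lam`), are illustrative assumptions and not the authors' implementation.

```python
import numpy as np

def large_margin_loss(Q, states, actions, margin=1.0):
    """Mean over expert samples of max_a [Q(s,a) + l(s,a)] - Q(s,a_E).

    Assumption: l(s,a) = margin for every non-expert action, 0 for the expert's.
    """
    idx = np.arange(len(states))
    scores = Q[states]                        # shape (N, n_actions), a copy
    penalty = np.full_like(scores, margin)
    penalty[idx, actions] = 0.0               # no margin on the expert action
    return np.mean(np.max(scores + penalty, axis=1) - scores[idx, actions])

def reward_regularizer(Q, s, a, s_next, gamma):
    """Mean |Q(s,a) - gamma * max_a' Q(s',a')| over sampled transitions.

    Assumption: this sampled quantity stands in for the reward implied by Q.
    """
    implied_reward = Q[s, a] - gamma * np.max(Q[s_next], axis=1)
    return np.mean(np.abs(implied_reward))

def regularized_loss(Q, expert, transitions, gamma=0.9, lam=0.1):
    """Classification loss plus lam times the MDP-aware regularizer."""
    exp_s, exp_a = expert
    s, a, s_next = transitions
    return (large_margin_loss(Q, exp_s, exp_a)
            + lam * reward_regularizer(Q, s, a, s_next, gamma))
```

Under these assumptions the loss depends only on sampled data (expert state-action pairs and observed transitions), so minimizing it never requires solving an MDP. It could in principle be minimized by subgradient descent over a parametric Q or by a boosting-style procedure; the sketch leaves that choice open.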

Keywords

Inverse Reinforcement Learning; Large margin methods; Boosting; Learning from Demonstrations

Domains

Computer Science [cs]; Engineering Sciences [physics]

https://inria.hal.science/hal-01107837

Submitted on: Tuesday, July 23, 2019, 14:06:55

Last modified on: Thursday, April 13, 2023, 09:26:12

Dates and versions

hal-01107837, version 1 (23-07-2019)

Identifiers

HAL Id: hal-01107837, version 1

Cite

Bilal Piot, Matthieu Geist, Olivier Pietquin. Boosted and Reward-regularized Classification for Apprenticeship Learning. AAMAS 2014: 13th International Conference on Autonomous Agents and Multiagent Systems, May 2014, Paris, France. pp. 1249-1256. ⟨hal-01107837⟩

Collections

SUPELEC UNIV-LILLE3 CNRS INRIA UNIV-FCOMTE UMI-GTL
