Learning a Behavioral Repertoire from Demonstrations

Niels Justesen; Miguel González-Duque; Daniel Cabarcas; Jean-Baptiste Mouret; Sebastian Risi

Conference paper. Year: 2020

Niels Justesen

Role: Author

IT University of Copenhagen

Miguel González-Duque

Role: Author

IT University of Copenhagen

Daniel Cabarcas

Role: Author

Universidad Nacional de Colombia Sede Medellín

Jean-Baptiste Mouret

Role: Author
PersonId: 1495
IdHAL: jb-mouret
ORCID: 0000-0002-2513-027X
IdRef: 137470002

Lifelong Autonomy and interaction skills for Robots in a Sensing ENvironment

Sebastian Risi

Role: Author

IT University of Copenhagen

Abstract

Imitation Learning (IL) is a machine learning approach to learn a policy from a set of demonstrations. IL can be useful to kick-start learning before applying reinforcement learning (RL), but it can also be useful on its own, e.g., to learn to imitate human players in video games. Despite the success of systems that use IL and RL, how such systems can adapt between game rounds is a neglected area of study but an important aspect of many strategy games. In this paper, we present a new approach called Behavioral Repertoire Imitation Learning (BRIL) that learns a repertoire of behaviors from a set of demonstrations by augmenting the state-action pairs with behavioral descriptions. The outcome of this approach is a single neural network policy conditioned on a behavior description that can be precisely modulated. We apply this approach to train a policy on 7,777 human demonstrations for the build-order planning task in StarCraft II. Dimensionality reduction is applied to construct a low-dimensional behavioral space from a high-dimensional description of the army unit composition of each human replay. The results demonstrate that the learned policy can be effectively manipulated to express distinct behaviors. Additionally, by applying the UCB1 algorithm, the policy can adapt its behavior between games to reach a performance beyond that of the traditional IL baseline approach.
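
The two mechanisms named in the abstract, conditioning on a behavior description and adapting between games with UCB1, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The three candidate descriptors, the state vector, the win probabilities, and the helper names `augment` and `run` are all hypothetical, and a simulated win/loss outcome stands in for playing a real game with the conditioned policy.

```python
import math
import random


def augment(state, behavior):
    """Append a behavior descriptor to a state vector.

    This mirrors the idea of augmenting state-action pairs with a
    behavioral description; the vector layout here is an assumption.
    """
    return list(state) + list(behavior)


class UCB1:
    """Standard UCB1 bandit; each arm is one candidate behavior descriptor."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms   # number of games played per arm
        self.sums = [0.0] * n_arms   # total reward collected per arm

    def select(self):
        # Play every arm once before using the confidence bound.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)
        scores = [
            self.sums[a] / self.counts[a]
            + math.sqrt(2.0 * math.log(total) / self.counts[a])
            for a in range(len(self.counts))
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.sums[arm] += reward


def run(n_games=500, seed=0):
    """Pick one behavior per game with UCB1 and return play counts per arm."""
    rng = random.Random(seed)
    behaviors = [(-1.0, 0.0), (0.0, 1.0), (1.0, -1.0)]  # hypothetical 2D descriptors
    win_prob = [0.2, 0.8, 0.4]                           # made-up win rates
    bandit = UCB1(len(behaviors))
    for _ in range(n_games):
        arm = bandit.select()
        # Input a behavior-conditioned policy would receive (unused in this toy).
        policy_input = augment([0.5, 0.1], behaviors[arm])
        assert len(policy_input) == 4
        # Simulated game outcome: 1.0 for a win, 0.0 for a loss.
        reward = 1.0 if rng.random() < win_prob[arm] else 0.0
        bandit.update(arm, reward)
    return bandit.counts


if __name__ == "__main__":
    print(run())
```

In this toy setting the bandit keeps the policy fixed and only changes the descriptor fed to it between games, which is the form of adaptation the abstract describes.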

Keywords

video game, imitation learning, online adaptation, quality diversity

Domains

Artificial Intelligence [cs.AI]

Main file

CoG_2020___Learning_a_behavioral_repertoire_from_demonstrations_arxiv.pdf (3.45 MB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-02868800

Submitted on: Monday, June 15, 2020, 16:03:32

Last modified on: Thursday, February 1, 2024, 10:04:00

Dates and versions

hal-02868800, version 1 (15-06-2020)

Identifiers

HAL Id: hal-02868800, version 1

Cite

Niels Justesen, Miguel González-Duque, Daniel Cabarcas, Jean-Baptiste Mouret, Sebastian Risi. Learning a Behavioral Repertoire from Demonstrations. CoG 2020 - IEEE Conference on Games, 2020, Osaka / Virtual, Japan. ⟨hal-02868800⟩

Collections

UNIV-RENNES1 CNRS INRIA IRISA UNIV-LORRAINE INRIA2 LORIA LORIA-AIS UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UR1-MATH-NUM

73 views

206 downloads
