Learning Nash Equilibrium for General-Sum Markov Games from Batch Data

Julien Pérolat; Florian Strub; Bilal Piot; Olivier Pietquin

Conference paper. Year: 2017

Julien Pérolat

Role: Author
PersonId : 14403
IdHAL : julien-perolat
IdRef : 137378181

Université de Lille, Sciences et Technologies

Sequential Learning

Florian Strub

Role: Author
PersonId : 18649
IdHAL : florian-strub
ORCID : 0000-0001-7271-5345

Sequential Learning

Bilal Piot

Role: Author

DeepMind [London]

Olivier Pietquin

Role: Author
PersonId : 4024
IdHAL : olivier-pietquin
ORCID : 0000-0002-5386-465X
IdRef : 142821861

Université de Lille, Sciences et Technologies

Institut universitaire de France

DeepMind [London]

Sequential Learning

Abstract

This paper addresses the problem of learning a Nash equilibrium in γ-discounted multiplayer general-sum Markov Games (MGs) in a batch setting. As the number of players in an MG increases, the agents may either collaborate or team apart to increase their final rewards. One way to address this problem is to look for a Nash equilibrium. Although several techniques exist for the subcase of two-player zero-sum MGs, those techniques fail to find a Nash equilibrium in general-sum Markov Games. In this paper, we introduce a new definition of ε-Nash equilibrium in MGs which captures the quality of a strategy in multiplayer games. We prove that minimizing the norm of two Bellman-like residuals implies learning such an ε-Nash equilibrium. Then, we show that minimizing an empirical estimate of the L_p norm of these Bellman-like residuals allows learning for general-sum games within the batch setting. Finally, we introduce a neural network architecture that successfully learns a Nash equilibrium in generic multiplayer general-sum turn-based MGs.
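As a rough illustration of what an ε-Nash condition measures, the sketch below checks it in the simplest possible case: a one-shot (stateless) general-sum matrix game. This is not the paper's definition, which is stated over Markov games, nor its Bellman-residual method; the function name `nash_gap`, the payoff tensors, and the prisoner's-dilemma numbers are all assumptions chosen for the example.

```python
import numpy as np

def nash_gap(payoffs, strategies):
    """Largest gain any one player can get by deviating unilaterally.

    payoffs[i] is player i's payoff tensor, with one axis per player.
    strategies[i] is player i's mixed strategy (a probability vector).
    A strategy profile is an epsilon-Nash equilibrium of this one-shot
    game when the returned gap is at most epsilon.
    """
    n = len(strategies)
    gap = 0.0
    for i in range(n):
        u = payoffs[i]
        # Average out every opponent's axis, highest axis first so the
        # indices of the remaining axes do not shift.
        for j in reversed(range(n)):
            if j != i:
                u = np.tensordot(u, strategies[j], axes=([j], [0]))
        # u[a] is now player i's expected payoff for pure action a.
        current = float(u @ strategies[i])
        gap = max(gap, float(u.max()) - current)
    return gap

# Hypothetical prisoner's dilemma: action 0 = cooperate, 1 = defect.
# Axis 0 indexes player 0's action, axis 1 indexes player 1's action.
payoffs = [
    np.array([[-1.0, -3.0], [0.0, -2.0]]),  # player 0
    np.array([[-1.0, 0.0], [-3.0, -2.0]]),  # player 1
]
defect = [np.array([0.0, 1.0]), np.array([0.0, 1.0])]
cooperate = [np.array([1.0, 0.0]), np.array([1.0, 0.0])]

gap_defect = nash_gap(payoffs, defect)        # no profitable deviation
gap_cooperate = nash_gap(payoffs, cooperate)  # each player gains 1 by defecting
```

In this toy game mutual defection has a gap of 0 and is therefore an exact Nash equilibrium, while mutual cooperation has a gap of 1. In the Markov-game setting of the abstract, the analogous quantity would be expressed through value functions rather than one-shot payoffs.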

Domains

Computer Science [cs] / Artificial Intelligence [cs.AI]

Main file

bellman-residual-aistats2016(5).pdf (909.96 KB)

Origin: Files produced by the author(s)

Contributor: Julien Pérolat

https://inria.hal.science/hal-01648489

Submitted on: Sunday, 26 November 2017, 13:37:11

Last modified on: Monday, 15 April 2024, 11:25:23

Dates and versions

hal-01648489 , version 1 (26-11-2017)

Identifiers

HAL Id: hal-01648489, version 1

Cite

Julien Pérolat, Florian Strub, Bilal Piot, Olivier Pietquin. Learning Nash Equilibrium for General-Sum Markov Games from Batch Data. AISTATS 2017 - The 20th International Conference on Artificial Intelligence and Statistics, Apr 2017, Fort Lauderdale, United States. pp.1-14. ⟨hal-01648489⟩

Collections

CNRS INRIA CRISTAL INRIA2 CRISTAL-SEQUEL UNIV-LILLE

166 views

284 downloads