Learning Nash Equilibrium for General-Sum Markov Games from Batch Data

Julien Pérolat (1, 2), Florian Strub (2), Bilal Piot (3), Olivier Pietquin (1, 4, 3, 2)
(2) SEQUEL - Sequential Learning
Inria Lille - Nord Europe, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
Abstract: This paper addresses the problem of learning a Nash equilibrium in γ-discounted multiplayer general-sum Markov Games (MGs) in a batch setting. As the number of players in an MG increases, the agents may either collaborate or split into separate teams to increase their final rewards. One way to address this problem is to look for a Nash equilibrium. Although several techniques have been found for the subcase of two-player zero-sum MGs, those techniques fail to find a Nash equilibrium in general-sum Markov Games. In this paper, we introduce a new definition of ε-Nash equilibrium in MGs which captures the quality of a strategy in multiplayer games. We prove that minimizing the norm of two Bellman-like residuals implies learning such an ε-Nash equilibrium. Then, we show that minimizing an empirical estimate of the L_p norm of these Bellman-like residuals allows learning for general-sum games within the batch setting. Finally, we introduce a neural network architecture that successfully learns a Nash equilibrium in generic multiplayer general-sum turn-based MGs.
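To make the two Bellman-like residuals mentioned in the abstract more concrete, here is a minimal Python sketch. It is not the authors' implementation and may differ from the exact objective defined in the paper. It assumes the usual textbook operators: an evaluation operator under the joint policy and a best-response operator in which one player maximizes while the others keep following their policies. It also assumes a small tabular game with a known model and a uniform average over states, whereas the paper works from batch samples. The names `q_value`, `joint_prob`, `residuals`, the dictionary layout, and the repeated prisoner's dilemma used as test data are all hypothetical choices made for illustration.

```python
from itertools import product


def q_value(game, v_i, i, s, a):
    """r_i(s, a) + gamma * sum_s' P(s' | s, a) * v_i(s') for joint action a."""
    nxt = sum(p * v_i[s2] for s2, p in enumerate(game["P"][s][a]))
    return game["r"][i][s][a] + game["gamma"] * nxt


def joint_prob(policy, s, a, skip=None):
    """Probability of joint action a in state s, leaving out player `skip`."""
    prob = 1.0
    for j, a_j in enumerate(a):
        if j != skip:
            prob *= policy[j][s][a_j]
    return prob


def lp_norm(xs, p):
    """L_p norm under a uniform distribution over the entries of xs."""
    return (sum(abs(x) ** p for x in xs) / len(xs)) ** (1.0 / p)


def residuals(game, policy, v, i, p=2):
    """Return (evaluation residual, best-response residual) for player i."""
    n_actions = game["n_actions"]
    joint = list(product(*(range(n) for n in n_actions)))
    eval_res, best_res = [], []
    for s in range(len(v[i])):
        # Evaluation operator: every player follows the joint policy.
        t_pi = sum(joint_prob(policy, s, a) * q_value(game, v[i], i, s, a)
                   for a in joint)
        # Best-response operator: player i maximizes, the others follow policy.
        t_star = max(
            sum(joint_prob(policy, s, a, skip=i) * q_value(game, v[i], i, s, a)
                for a in joint if a[i] == a_i)
            for a_i in range(n_actions[i])
        )
        eval_res.append(t_pi - v[i][s])
        best_res.append(t_star - v[i][s])
    return lp_norm(eval_res, p), lp_norm(best_res, p)


# Assumed test data: a one-state repeated prisoner's dilemma, gamma = 0.5.
# Action 0 = cooperate, action 1 = defect; joint actions are (a_0, a_1).
joint_actions = list(product(range(2), range(2)))
pd_game = {
    "gamma": 0.5,
    "n_actions": [2, 2],
    "P": [{a: [1.0] for a in joint_actions}],  # always stay in state 0
    "r": [
        [{(0, 0): 3.0, (0, 1): 0.0, (1, 0): 5.0, (1, 1): 1.0}],  # player 0
        [{(0, 0): 3.0, (0, 1): 5.0, (1, 0): 0.0, (1, 1): 1.0}],  # player 1
    ],
}

# policy[player][state][action]; values are the exact discounted returns.
both_defect = [[[0.0, 1.0]], [[0.0, 1.0]]]
both_cooperate = [[[1.0, 0.0]], [[1.0, 0.0]]]
v_defect = [[2.0], [2.0]]   # 1 / (1 - 0.5)
v_coop = [[6.0], [6.0]]     # 3 / (1 - 0.5)

print(residuals(pd_game, both_defect, v_defect, 0))
print(residuals(pd_game, both_cooperate, v_coop, 0))
```

In this toy game, mutual defection is a Nash equilibrium, so both residuals vanish when the value function matches the policy. Under mutual cooperation the evaluation residual is still zero, but the best-response residual is positive because each player would gain by deviating.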
Document type:
Conference paper
AISTATS 2017 - The 20th International Conference on Artificial Intelligence and Statistics, Apr 2017, Fort Lauderdale, United States. pp.1-14


https://hal.inria.fr/hal-01648489
Contributor: Julien Pérolat
Submitted on: Sunday, November 26, 2017 - 13:37:11
Last modified on: Tuesday, July 3, 2018 - 11:36:02

File

bellman-residual-aistats2016(5...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01648489, version 1

Citation

Julien Pérolat, Florian Strub, Bilal Piot, Olivier Pietquin. Learning Nash Equilibrium for General-Sum Markov Games from Batch Data. AISTATS 2017 - The 20th International Conference on Artificial Intelligence and Statistics, Apr 2017, Fort Lauderdale, United States. pp.1-14. 〈hal-01648489〉


Metrics

Record views

158

File downloads

64