Sharing Information in Adversarial Bandit

David L. Saint-Pierre (1, 2), Olivier Teytaud (2, 1)
2 TAO - Machine Learning and Optimisation
LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France, CNRS - Centre National de la Recherche Scientifique : UMR8623
Abstract: 2-Player games in general provide a popular platform for research in Artificial Intelligence (AI). One of the main challenges coming from this platform is approximating a Nash Equilibrium (NE) over zero-sum matrix games. While the problem of computing such a Nash Equilibrium is solvable in polynomial time using Linear Programming (LP), it rapidly becomes infeasible to solve as the size of the matrix grows; a situation commonly encountered in games. This paper focuses on improving the approximation of a NE for matrix games such that it outperforms the state-of-the-art algorithms given a finite (and rather small) number T of oracle requests to rewards. To reach this objective, we propose to share information between the different relevant pure strategies. We show both theoretically by improving the bound and empirically by experiments on artificial matrices and on a real-world game that information sharing leads to an improvement of the approximation of the NE.
Document type:
Conference paper
EvoGames 2014, Apr 2014, Granada, Spain. 2014, proceedings of EvoStar 2014. 〈10.1007/978-3-662-45523-4_32〉

https://hal.inria.fr/hal-01116716
Contributor: Olivier Teytaud
Submitted on: Tuesday, 17 February 2015 - 09:23:50
Last modified on: Thursday, 5 April 2018 - 12:30:12
Document(s) archived on: Monday, 18 May 2015 - 10:05:56

File

sharinginfo (1).pdf
Files produced by the author(s)

Citation

David L. Saint-Pierre, Olivier Teytaud. Sharing Information in Adversarial Bandit. EvoGames 2014, Apr 2014, Granada, Spain. 2014, proceedings of EvoStar 2014. 〈10.1007/978-3-662-45523-4_32〉. 〈hal-01116716〉

Metrics

Record views

314

File downloads

100