Sharing Information in Adversarial Bandit

David L. Saint-Pierre 1, 2 Olivier Teytaud 2, 1
2 TAO - Machine Learning and Optimisation
CNRS - Centre National de la Recherche Scientifique : UMR8623, Inria Saclay - Ile de France, UP11 - Université Paris-Sud - Paris 11, LRI - Laboratoire de Recherche en Informatique
Abstract : 2-Player games in general provide a popular platform for research in Artificial Intelligence (AI). One of the main challenges coming from this plat-form is approximating a Nash Equilibrium (NE) over zero-sum matrix games. While the problem of computing such a Nash Equilibrium is solvable in polyno-mial time using Linear Programming (LP), it rapidly becomes infeasible to solve as the size of the matrix grows; a situation commonly encountered in games. This paper focuses on improving the approximation of a NE for matrix games such that it outperforms the state-of-the-art algorithms given a finite (and rather small) number T of oracle requests to rewards. To reach this objective, we pro-pose to share information between the different relevant pure strategies. We show both theoretically by improving the bound and empirically by experiments on ar-tificial matrices and on a real-world game that information sharing leads to an improvement of the approximation of the NE.
Document type :
Conference papers
Complete list of metadatas

Cited literature [12 references]  Display  Hide  Download

https://hal.inria.fr/hal-01116716
Contributor : Olivier Teytaud <>
Submitted on : Tuesday, February 17, 2015 - 9:23:50 AM
Last modification on : Thursday, April 5, 2018 - 12:30:12 PM
Long-term archiving on : Monday, May 18, 2015 - 10:05:56 AM

File

sharinginfo (1).pdf
Files produced by the author(s)

Identifiers

Collections

Citation

David L. Saint-Pierre, Olivier Teytaud. Sharing Information in Adversarial Bandit. EvoGames 2014, Apr 2014, Granada, Spain. ⟨10.1007/978-3-662-45523-4_32⟩. ⟨hal-01116716⟩

Share

Metrics

Record views

427

Files downloads

232