Online Sparse bandit for Card Games

David Saint-Pierre 1, 2, 3 Quentin Louveaux 1 Olivier Teytaud 2, 3
3 TAO - Machine Learning and Optimisation
CNRS - Centre National de la Recherche Scientifique : UMR8623, Inria Saclay - Ile de France, UP11 - Université Paris-Sud - Paris 11, LRI - Laboratoire de Recherche en Informatique
Abstract : Finding an approximation of a Nash equilibria in matrix games is an important topic that reaches beyond the strict application to matrix games. A bandit algorithm commonly used to approximate a Nash equilibrium is EXP3 [?]. However, the solution to many problems is often sparse, yet EXP3 inherently fails to exploit this property. To the knowledge of the authors, there exist only an offline truncation pro-posed by [?] to tackle such issue. In this paper, we propose a variation of EXP3 to exploit the fact that solution is sparse by dynamically removing arms; the resulting algorithm empirically performs better than previous versions. We apply the resulting algorithm to a MCTS program for the Urban Rivals card game.
Document type :
Conference papers
Complete list of metadatas

https://hal.inria.fr/hal-01116714
Contributor : Olivier Teytaud <>
Submitted on : Tuesday, February 17, 2015 - 9:26:32 AM
Last modification on : Thursday, April 5, 2018 - 12:30:12 PM
Long-term archiving on : Monday, May 18, 2015 - 10:05:46 AM

File

onlinesparse.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01116714, version 1

Collections

Citation

David Saint-Pierre, Quentin Louveaux, Olivier Teytaud. Online Sparse bandit for Card Games. Advances in Computer Games, Nov 2011, Tilburg, France. ⟨hal-01116714⟩

Share

Metrics

Record views

311

Files downloads

251