Skip to Main content Skip to Navigation
Conference papers

Online Sparse bandit for Card Games

David Saint-Pierre 1, 2, 3 Quentin Louveaux 1 Olivier Teytaud 2, 3 
3 TAO - Machine Learning and Optimisation
LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France, CNRS - Centre National de la Recherche Scientifique : UMR8623
Abstract : Finding an approximation of a Nash equilibria in matrix games is an important topic that reaches beyond the strict application to matrix games. A bandit algorithm commonly used to approximate a Nash equilibrium is EXP3 [?]. However, the solution to many problems is often sparse, yet EXP3 inherently fails to exploit this property. To the knowledge of the authors, there exist only an offline truncation pro-posed by [?] to tackle such issue. In this paper, we propose a variation of EXP3 to exploit the fact that solution is sparse by dynamically removing arms; the resulting algorithm empirically performs better than previous versions. We apply the resulting algorithm to a MCTS program for the Urban Rivals card game.
Document type :
Conference papers
Complete list of metadata
Contributor : Olivier Teytaud Connect in order to contact the contributor
Submitted on : Tuesday, February 17, 2015 - 9:26:32 AM
Last modification on : Thursday, July 8, 2021 - 3:49:58 AM
Long-term archiving on: : Monday, May 18, 2015 - 10:05:46 AM


Files produced by the author(s)


  • HAL Id : hal-01116714, version 1


David Saint-Pierre, Quentin Louveaux, Olivier Teytaud. Online Sparse bandit for Card Games. Advances in Computer Games, Nov 2011, Tilburg, France. ⟨hal-01116714⟩



Record views


Files downloads