Online Sparse bandit for Card Games

David Saint-Pierre 1, 2, 3, Quentin Louveaux 1, Olivier Teytaud 2, 3
3 TAO - Machine Learning and Optimisation
CNRS - Centre National de la Recherche Scientifique : UMR8623, Inria Saclay - Ile de France, UP11 - Université Paris-Sud - Paris 11, LRI - Laboratoire de Recherche en Informatique
Abstract: Finding an approximation of a Nash equilibrium in matrix games is an important topic whose relevance reaches beyond matrix games themselves. A bandit algorithm commonly used to approximate a Nash equilibrium is EXP3 [?]. However, the solution to many problems is often sparse, yet EXP3 inherently fails to exploit this property. To the best of the authors' knowledge, only an offline truncation proposed by [?] addresses this issue. In this paper, we propose a variation of EXP3 that exploits the sparsity of the solution by dynamically removing arms; the resulting algorithm empirically performs better than previous versions. We apply the resulting algorithm to an MCTS program for the Urban Rivals card game.
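As a rough illustration of the kind of algorithm described in the abstract, the Python sketch below implements a standard EXP3 update together with a simple arm-removal step. The function name, parameters, and in particular the pruning rule (dropping arms whose mixed probability stays below a fixed threshold after a warm-up period) are assumptions made for illustration only; they are not the truncation or removal scheme of the paper itself.

    import math
    import random

    def exp3_with_pruning(n_arms, reward_fn, horizon,
                          gamma=0.07, prune_after=1000, prune_threshold=0.01):
        """Minimal EXP3 sketch with a hypothetical arm-removal step.

        The pruning rule (discard arms whose mixed probability is below
        `prune_threshold` after `prune_after` rounds) is an illustrative
        assumption, not the scheme from the paper.
        """
        active = list(range(n_arms))
        weights = {a: 1.0 for a in active}

        for t in range(1, horizon + 1):
            k = len(active)
            total = sum(weights[a] for a in active)
            # EXP3 mixed distribution: exploit the weights, explore uniformly.
            probs = {a: (1 - gamma) * weights[a] / total + gamma / k for a in active}

            arm = random.choices(active, weights=[probs[a] for a in active])[0]
            reward = reward_fn(arm)  # reward assumed to lie in [0, 1]

            # Importance-weighted reward estimate and multiplicative update.
            estimate = reward / probs[arm]
            weights[arm] *= math.exp(gamma * estimate / k)

            # Hypothetical sparsity step: drop arms that look negligible.
            if t >= prune_after and len(active) > 1:
                survivors = [a for a in active if probs[a] >= prune_threshold]
                if survivors:
                    active = survivors

        total = sum(weights[a] for a in active)
        # Normalized weights over surviving arms: an approximate (sparse) strategy.
        return {a: weights[a] / total for a in active}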
Document type:
Conference paper
Advances in Computer Games, Nov 2011, Tilburg, Netherlands. 2012

https://hal.inria.fr/hal-01116714
Contributor: Olivier Teytaud
Submitted on: Tuesday, February 17, 2015 - 09:26:32
Last modified on: Thursday, April 5, 2018 - 12:30:12
Document(s) archived on: Monday, May 18, 2015 - 10:05:46

File

onlinesparse.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01116714, version 1


Citation

David Saint-Pierre, Quentin Louveaux, Olivier Teytaud. Online Sparse bandit for Card Games. Advances in Computer Games, Nov 2011, Tilburg, Netherlands. 2012. 〈hal-01116714〉


Metrics

Record views: 264
File downloads: 122