Anytime many-armed bandits

Olivier Teytaud 1 Sylvain Gelly 1 Michèle Sebag 1
1 TANC - Algorithmic number theory for cryptology
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], Inria Saclay - Ile de France, X - École polytechnique, CNRS - Centre National de la Recherche Scientifique : UMR7161
Abstract : This paper introduces the many-armed bandit problem (ManAB), where the number of arms is large comparatively to the relevant number of time steps. While the ManAB framework is relevant to many real-world applications, the state of the art does not offer anytime algorithms handling ManAB problems. Both theory and practice suggest that two problem categories must be distinguished; the easy category includes those problems where good arms have reward probability close to 1; the difficult category includes other problems. Two algorithms termed FAILURE and MUCBT are proposed for the ManAB framework. FAILURE and its variants extend the non-anytime approach proposed for the denumerable-armed bandit and non-asymptotic bounds are shown; it works very efficiently for easy ManAB problems. Meanwhile, MUCBT efficiently deals with difficult ManAB problems.
Type de document :
Communication dans un congrès
CAP07, 2007, Grenoble, France. 2007
Liste complète des métadonnées

Littérature citée [13 références]  Voir  Masquer  Télécharger

https://hal.inria.fr/inria-00173263
Contributeur : Olivier Teytaud <>
Soumis le : mercredi 19 septembre 2007 - 14:20:51
Dernière modification le : jeudi 10 mai 2018 - 02:06:28
Document(s) archivé(s) le : jeudi 8 avril 2010 - 19:46:10

Fichier

mabcap2.pdf
Fichiers produits par l'(les) auteur(s)

Identifiants

  • HAL Id : inria-00173263, version 1

Collections

Citation

Olivier Teytaud, Sylvain Gelly, Michèle Sebag. Anytime many-armed bandits. CAP07, 2007, Grenoble, France. 2007. 〈inria-00173263〉

Partager

Métriques

Consultations de la notice

348

Téléchargements de fichiers

244