Skip to Main content Skip to Navigation
Conference papers

Anytime many-armed bandits

Olivier Teytaud 1 Sylvain Gelly 1 Michèle Sebag 1
1 TANC - Algorithmic number theory for cryptology
Inria Saclay - Ile de France, LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau]
Abstract : This paper introduces the many-armed bandit problem (ManAB), where the number of arms is large comparatively to the relevant number of time steps. While the ManAB framework is relevant to many real-world applications, the state of the art does not offer anytime algorithms handling ManAB problems. Both theory and practice suggest that two problem categories must be distinguished; the easy category includes those problems where good arms have reward probability close to 1; the difficult category includes other problems. Two algorithms termed FAILURE and MUCBT are proposed for the ManAB framework. FAILURE and its variants extend the non-anytime approach proposed for the denumerable-armed bandit and non-asymptotic bounds are shown; it works very efficiently for easy ManAB problems. Meanwhile, MUCBT efficiently deals with difficult ManAB problems.
Document type :
Conference papers
Complete list of metadata

Cited literature [13 references]  Display  Hide  Download
Contributor : Olivier Teytaud <>
Submitted on : Wednesday, September 19, 2007 - 2:20:51 PM
Last modification on : Thursday, November 5, 2020 - 9:00:05 AM
Long-term archiving on: : Thursday, April 8, 2010 - 7:46:10 PM


Files produced by the author(s)


  • HAL Id : inria-00173263, version 1



Olivier Teytaud, Sylvain Gelly, Michèle Sebag. Anytime many-armed bandits. CAP07, 2007, Grenoble, France. ⟨inria-00173263⟩



Record views


Files downloads