HAL will be down for maintenance from Friday, June 10 at 4pm through Monday, June 13 at 9am. More information
Skip to Main content Skip to Navigation
Conference papers

Anytime many-armed bandits

Olivier Teytaud 1 Sylvain Gelly 1 Michèle Sebag 1
1 TANC - Algorithmic number theory for cryptology
Inria Saclay - Ile de France, LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau]
Abstract : This paper introduces the many-armed bandit problem (ManAB), where the number of arms is large comparatively to the relevant number of time steps. While the ManAB framework is relevant to many real-world applications, the state of the art does not offer anytime algorithms handling ManAB problems. Both theory and practice suggest that two problem categories must be distinguished; the easy category includes those problems where good arms have reward probability close to 1; the difficult category includes other problems. Two algorithms termed FAILURE and MUCBT are proposed for the ManAB framework. FAILURE and its variants extend the non-anytime approach proposed for the denumerable-armed bandit and non-asymptotic bounds are shown; it works very efficiently for easy ManAB problems. Meanwhile, MUCBT efficiently deals with difficult ManAB problems.
Document type :
Conference papers
Complete list of metadata

Cited literature [13 references]  Display  Hide  Download

Contributor : Olivier Teytaud Connect in order to contact the contributor
Submitted on : Wednesday, September 19, 2007 - 2:20:51 PM
Last modification on : Friday, February 4, 2022 - 3:21:34 AM
Long-term archiving on: : Thursday, April 8, 2010 - 7:46:10 PM


Files produced by the author(s)


  • HAL Id : inria-00173263, version 1



Olivier Teytaud, Sylvain Gelly, Michèle Sebag. Anytime many-armed bandits. CAP07, 2007, Grenoble, France. ⟨inria-00173263⟩



Record views


Files downloads