Conference papers

Stochastic bandits with arm-dependent delays

Abstract : Significant work has recently been dedicated to stochastic delayed bandits because of their relevance in applications. The applicability of existing algorithms is, however, restricted by the strong assumptions often made on the delay distributions, such as full observability, restrictive shape constraints, or uniformity over arms. In this work, we weaken these assumptions significantly and only assume that there is a bound on the tail of the delay. In particular, we cover the important case where the delay distributions vary across arms, and the case where the delays are heavy-tailed. To address these difficulties, we propose a simple but efficient UCB-based algorithm called PatientBandits. We provide both problem-dependent and problem-independent bounds on the regret, as well as performance lower bounds.
Document type : Conference papers

https://hal.inria.fr/hal-02950116
Contributor : Michal Valko
Submitted on : Sunday, September 27, 2020 - 11:39:50 AM
Last modification on : Wednesday, October 14, 2020 - 4:13:50 AM

File

manegueu2020stochastic.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02950116, version 1

Citation

Anne Manegueu, Claire Vernade, Alexandra Carpentier, Michal Valko. Stochastic bandits with arm-dependent delays. International Conference on Machine Learning, 2020, Vienna, Austria. ⟨hal-02950116⟩
