Stochastic bandits with arm-dependent delays

Anne Gael Manegueu; Claire Vernade; Alexandra Carpentier; Michal Valko

Conference paper. Year: 2020

Anne Gael Manegueu

Role: Author

Otto-von-Guericke-Universität Magdeburg = Otto-von-Guericke University [Magdeburg]

Claire Vernade

Role: Author

DeepMind [London]

Alexandra Carpentier

Role: Author

Otto-von-Guericke-Universität Magdeburg = Otto-von-Guericke University [Magdeburg]

Michal Valko

Role: Author
PersonId: 284
IdHAL: michal
IdRef: 22360934X

DeepMind [London]

Abstract

Significant work has recently been dedicated to stochastic delayed bandits because of their relevance in applications. The applicability of existing algorithms is, however, restricted by the strong assumptions often made on the delay distributions, such as full observability, restrictive shape constraints, or uniformity over arms. In this work, we weaken these assumptions significantly and only assume that there is a bound on the tail of the delay. In particular, we cover the important case where the delay distributions vary across arms, and the case where the delays are heavy-tailed. To address these difficulties, we propose a simple but efficient UCB-based algorithm called PatientBandits. We provide both problem-dependent and problem-independent bounds on the regret, as well as performance lower bounds.

Domains

Machine Learning [stat.ML]

Main file

manegueu2020stochastic.pdf (1.93 MB)

Origin: Files produced by the author(s)

Contributor: Michal Valko

https://inria.hal.science/hal-02950116

Submitted on: Sunday, September 27, 2020, 11:39:50

Last modified on: Monday, May 30, 2022, 17:36:02

Long-term archiving on: Thursday, December 3, 2020, 18:35:53

Dates and versions

hal-02950116, version 1 (27-09-2020)

Identifiers

HAL Id: hal-02950116, version 1

Cite

Anne Gael Manegueu, Claire Vernade, Alexandra Carpentier, Michal Valko. Stochastic bandits with arm-dependent delays. International Conference on Machine Learning, 2020, Vienna, Austria. ⟨hal-02950116⟩

16 views

66 downloads