Best of both worlds: Stochastic & adversarial best-arm identification

Yasin Abbasi-Yadkori; Peter Bartlett; Victor Gabillon; Alan Malek; Michal Valko

Conference paper, Year: 2018



Yasin Abbasi-Yadkori

Role: Author

Adobe Research

Peter Bartlett

Role: Author

Lawrence Berkeley National Laboratory [Berkeley]

Victor Gabillon

Role: Author
PersonId : 900485

Queensland University of Technology [Brisbane]

Alan Malek

Role: Author
PersonId : 1032825

Massachusetts Institute of Technology

Michal Valko

Role: Author
PersonId : 284
IdHAL : michal
IdRef : 22360934X

Sequential Learning

Abstract

We study bandit best-arm identification with arbitrary and potentially adversarial rewards. A simple random uniform learner obtains the optimal rate of error in the adversarial scenario. However, this type of strategy is suboptimal when the rewards are sampled stochastically. Therefore, we ask: Can we design a learner that performs optimally in both the stochastic and adversarial problems while not being aware of the nature of the rewards? First, we show that designing such a learner is impossible in general. In particular, to be robust to adversarial rewards, we can only guarantee optimal rates of error on a subset of the stochastic problems. We give a lower bound that characterizes the optimal rate in stochastic problems if the strategy is constrained to be robust to adversarial rewards. Finally, we design a simple parameter-free algorithm and show that its probability of error matches (up to log factors) the lower bound in stochastic problems, and it is also robust to adversarial ones.
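As a rough illustration of the "simple random uniform learner" mentioned in the abstract, the sketch below pulls an arm uniformly at random in every round and, once the budget is spent, recommends the arm with the largest importance-weighted cumulative reward estimate. It is a minimal sketch under stated assumptions, not the authors' algorithm: the function name `uniform_best_arm`, the callback `reward_fn(t, arm)`, the fixed `budget`, and the choice of importance weighting are all hypothetical and do not come from this record.

```python
import random


def uniform_best_arm(reward_fn, n_arms, budget, seed=0):
    """Uniform-sampling learner for fixed-budget best-arm identification.

    Assumption (not from the source): reward_fn(t, arm) returns the reward
    of `arm` at round `t`; the rewards may be stochastic or chosen by an
    adversary, and the learner never uses that distinction.
    """
    rng = random.Random(seed)
    estimates = [0.0] * n_arms
    for t in range(budget):
        # Every arm is pulled with the same probability 1 / n_arms.
        arm = rng.randrange(n_arms)
        reward = reward_fn(t, arm)
        # Importance weighting: dividing by the pull probability (that is,
        # multiplying by n_arms) gives an unbiased estimate of the arm's
        # reward in this round, including for arms that were not pulled.
        estimates[arm] += reward * n_arms
    # Recommend the arm with the largest estimated cumulative reward.
    return max(range(n_arms), key=lambda a: estimates[a])


if __name__ == "__main__":
    # Hypothetical toy problem: arm 2 always pays 1.0, the others pay 0.0.
    best = uniform_best_arm(lambda t, a: 1.0 if a == 2 else 0.0, n_arms=3, budget=200)
    print("recommended arm:", best)
```

Because the sampling distribution never adapts to what has been observed, a learner of this kind spends as many pulls on clearly bad arms as on close contenders, which gives some intuition for why the abstract calls the strategy suboptimal when rewards are stochastic.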

Keywords

adversarial and stochastic rewards; multi-armed bandits; best-arm identification

Domains

Machine Learning [stat.ML]

Main file

Main.pdf (1.11 MB)

Origin: Files produced by the author(s)

Contributor: Victor Gabillon

https://inria.hal.science/hal-01808948

Submitted on: Monday, 23 July 2018, 23:13:48

Last modified on: Wednesday, 24 January 2024, 09:54:23

Long-term archiving on: Wednesday, 24 October 2018, 15:27:40

Dates and versions

hal-01808948 , version 1 (06-06-2018)

hal-01808948 , version 2 (12-07-2018)

hal-01808948 , version 3 (16-07-2018)

hal-01808948 , version 4 (23-07-2018)

hal-01808948 , version 5 (19-07-2021)

hal-01808948 , version 6 (31-07-2023)

Identifiers

HAL Id : hal-01808948 , version 4

Cite

Yasin Abbasi-Yadkori, Peter Bartlett, Victor Gabillon, Alan Malek, Michal Valko. Best of both worlds: Stochastic & adversarial best-arm identification. Conference on Learning Theory, 2018, Stockholm, Sweden. ⟨hal-01808948v4⟩
