On Best-Arm Identification with a Fixed Budget in Non-Parametric Multi-Armed Bandits - Inria - Institut national de recherche en sciences et technologies du numérique Accéder directement au contenu
Communication Dans Un Congrès Année : 2023

On Best-Arm Identification with a Fixed Budget in Non-Parametric Multi-Armed Bandits

Résumé

We lay the foundations of a non-parametric theory of best-arm identification in multi-armed bandits with a fixed budget T. We consider general, possibly non-parametric, models D for distributions over the arms; an overarching example is the model D = P(0,1) of all probability distributions over [0,1]. We propose upper bounds on the average log-probability of misidentifying the optimal arm based on information-theoretic quantities that correspond to infima over Kullback-Leibler divergences between some distributions in D and a given distribution. This is made possible by a refined analysis of the successive-rejects strategy of Audibert, Bubeck, and Munos (2010). We finally provide lower bounds on the same average log-probability, also in terms of the same new information-theoretic quantities; these lower bounds are larger when the (natural) assumptions on the considered strategies are stronger. All these new upper and lower bounds generalize existing bounds based, e.g., on gaps between distributions.
Fichier principal
Vignette du fichier
barrier23.pdf (496.17 Ko) Télécharger le fichier
Origine : Fichiers produits par l'(les) auteur(s)

Dates et versions

hal-03792668 , version 1 (30-09-2022)
hal-03792668 , version 2 (31-01-2023)

Identifiants

Citer

Antoine Barrier, Aurélien Garivier, Gilles Stoltz. On Best-Arm Identification with a Fixed Budget in Non-Parametric Multi-Armed Bandits. ALT 2023 - The 34th International Conference on Algorithmic Learning Theory, Feb 2023, Singapour, Singapore. ⟨hal-03792668v2⟩
199 Consultations
121 Téléchargements

Altmetric

Partager

Gmail Facebook X LinkedIn More