hal-00747005, version 2
Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence
Victor Gabillon
1Mohammad Ghavamzadeh
a, 1Alessandro Lazaric
a, 1
(2012)
Résumé : We study the problem of identifying the best arm(s) in the stochastic multi-armed bandit setting. This problem has been studied in the literature from two different perspectives: fixed budget and fixed confidence. We propose a unifying approach that leads to a meta-algorithm called unified gap-based exploration (UGapE), with a common structure and similar theoretical analysis for these two settings. We prove a performance bound for the two versions of the algorithm showing that the two problems are characterized by the same notion of complexity. We also show how the UGapE algorithm as well as its theoretical analysis can be extended to take into account the variance of the arms and to multiple bandits. Finally, we evaluate the performance of UGapE and compare it with a number of existing fixed budget and fixed confidence algorithms.
- a – INRIA
- 1 : SEQUEL (INRIA Lille - Nord Europe)
- INRIA – CNRS : UMR8146 – Université Lille I - Sciences et technologies – Université Lille III - Sciences humaines et sociales – Ecole Centrale de Lille
- Domaine : Informatique/Apprentissage
- Versions disponibles : v1 (30-10-2012) v2 (15-01-2013)
- hal-00747005, version 2
- http://hal.inria.fr/hal-00747005
- oai:hal.inria.fr:hal-00747005
- Contributeur : Victor Gabillon
- Soumis le : Lundi 14 Janvier 2013, 16:45:54
- Dernière modification le : Mardi 15 Janvier 2013, 13:48:48






Documents associés
Exporter