A minimax and asymptotically optimal algorithm for stochastic bandits

Pierre Ménard; Aurélien Garivier

Preprint, Working Paper. Year: 2017

Pierre Ménard

Role: Author
PersonId: 1022182

Institut de Mathématiques de Toulouse UMR5219

Aurélien Garivier

Role: Author
PersonId: 4986
IdHAL: aurelien-garivier
ORCID: 0000-0002-4906-9573
IdRef: 111902495

Institut de Mathématiques de Toulouse UMR5219

Abstract

We propose the kl-UCB++ algorithm for regret minimization in stochastic bandit models with exponential families of distributions. We prove that it is simultaneously asymptotically optimal (in the sense of Lai and Robbins' lower bound) and minimax optimal. This is the first algorithm proved to enjoy these two properties at the same time. This work thus merges two different lines of research, with simple proofs involving no complexity overhead.
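
To make the abstract more concrete, here is a minimal sketch of a kl-UCB-style index for Bernoulli rewards, computed by bisection. The exploration function below is written from our recollection of the paper and should be checked against main.pdf before use. The function names (`kl_bernoulli`, `exploration`, `kl_ucb_index`) and the parameters `horizon` and `n_arms` are illustrative and do not come from this page.

```python
import math


def kl_bernoulli(p, q, eps=1e-12):
    """Kullback-Leibler divergence between Bernoulli(p) and Bernoulli(q)."""
    # Clip both arguments away from 0 and 1 so the logarithms stay finite.
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def log_plus(x):
    """Positive part of the natural logarithm, for x > 0."""
    return max(math.log(x), 0.0)


def exploration(n, horizon, n_arms):
    """Assumed exploration function: log+(r * (log+(r)^2 + 1)), r = T / (K n).

    This form is an assumption recalled from the paper, not stated on this page.
    """
    r = horizon / (n_arms * n)
    return log_plus(r * (log_plus(r) ** 2 + 1.0))


def kl_ucb_index(mean, n, horizon, n_arms, tol=1e-9):
    """Largest q in [mean, 1] with n * kl(mean, q) <= exploration(n), up to tol."""
    budget = exploration(n, horizon, n_arms) / n
    lo, hi = mean, 1.0
    # kl(mean, q) is increasing in q on [mean, 1], so bisection applies.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if kl_bernoulli(mean, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo
```

In a bandit loop, one would pull each arm once and then, at every round, pull the arm with the largest index. Under the assumed exploration function, the bonus vanishes once an arm has been pulled at least `horizon / n_arms` times, so the index falls back to the empirical mean.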

Keywords

Stochastic multi-armed bandits; Regret analysis; Upper confidence bound (UCB); Minimax optimality; Asymptotic optimality

Domains

Statistics [math.ST]; Theory [stat.TH]; Machine Learning [stat.ML]; Probability [math.PR]

Main file

main.pdf (171.35 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-01475078

Submitted on: Thursday, February 23, 2017, 14:32:33

Last modified on: Monday, November 20, 2023, 11:44:19

Long-term archiving on: Wednesday, May 24, 2017, 14:01:58

Dates and versions

hal-01475078, version 1 (23-02-2017)

hal-01475078, version 2 (19-09-2017)

Identifiers

HAL Id: hal-01475078, version 1
arXiv: 1702.07211

Cite

Pierre Ménard, Aurélien Garivier. A minimax and asymptotically optimal algorithm for stochastic bandits. 2017. ⟨hal-01475078v1⟩

549 views

344 downloads
