Skip to Main content Skip to Navigation
Preprints, Working Papers, ...

A minimax and asymptotically optimal algorithm for stochastic bandits

Abstract : We propose the kl-UCB ++ algorithm for regret minimization in stochastic bandit models with exponential families of distributions. We prove that it is simultaneously asymptotically optimal (in the sense of Lai and Robbins' lower bound) and minimax optimal. This is the first algorithm proved to enjoy these two properties at the same time. This work thus merges two different lines of research, with simple proofs involving no complexity overhead.
Complete list of metadatas

Cited literature [19 references]  Display  Hide  Download

https://hal.archives-ouvertes.fr/hal-01475078
Contributor : Pierre Ménard <>
Submitted on : Thursday, February 23, 2017 - 2:32:33 PM
Last modification on : Thursday, March 5, 2020 - 3:08:03 PM
Long-term archiving on: : Wednesday, May 24, 2017 - 2:01:58 PM

Files

main.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01475078, version 1
  • ARXIV : 1702.07211

Citation

Pierre Ménard, Aurélien Garivier. A minimax and asymptotically optimal algorithm for stochastic bandits. 2017. ⟨hal-01475078v1⟩

Share

Metrics

Record views

246

Files downloads

166