Solving Bernoulli Rank-One Bandits with Unimodal Thompson Sampling

Cindy Trinh; Emilie Kaufmann; Claire Vernade; Richard Combes

Preprints, Working Papers, ... Year : 2019

Cindy Trinh

Function : Author

Ecole Normale Supérieure Paris-Saclay

Emilie Kaufmann

Function : Author
PersonId : 10422
IdHAL : emilie-kaufmann
ORCID : 0000-0002-5496-824X
IdRef : 197040810

Centre National de la Recherche Scientifique

Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189

Sequential Learning

Claire Vernade

Function : Author
PersonId : 970871

DeepMind [London]

Richard Combes

Function : Author
PersonId : 14877
IdHAL : richard-combes
ORCID : 0000-0003-3954-7241
IdRef : 171607732

Laboratoire des signaux et systèmes

CentraleSupélec

Abstract

Stochastic Rank-One Bandits (Katariya et al., 2017a,b) are a simple framework for regret minimization problems over rank-one matrices of arms. The algorithms initially proposed for this problem are proved to have logarithmic regret, but they do not match the existing lower bound. We close this gap by first proving that rank-one bandits are a particular instance of unimodal bandits, and then providing a new analysis of Unimodal Thompson Sampling (UTS), initially proposed by Paladino et al. (2017). We prove an asymptotically optimal bound on the frequentist regret of UTS, and we support our claims with simulations showing that our method significantly improves on the state of the art.
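To make the idea in the abstract concrete, here is a minimal sketch of a UTS-style loop on a Bernoulli rank-one bandit. It is not the authors' code. It assumes that arm (i, j) has mean u[i] * v[j], that the neighbors of an arm are the arms sharing its row or its column, and that the empirical leader is played every `gamma`-th time it leads. The function names, the `gamma` default, and the initialization round are illustrative choices rather than details taken from the paper.

```python
import random


def neighbors(arm, n_rows, n_cols):
    """Arms sharing a row or a column with `arm`, including `arm` itself."""
    i, j = arm
    same_row = [(i, c) for c in range(n_cols)]
    same_col = [(r, j) for r in range(n_rows) if r != i]
    return same_row + same_col


def run_uts(u, v, horizon, gamma=2, seed=0):
    """Simulate a UTS-style strategy; returns (pull counts, expected regret)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(u), len(v)
    arms = [(i, j) for i in range(n_rows) for j in range(n_cols)]
    successes = {a: 0 for a in arms}
    pulls = {a: 0 for a in arms}
    leader_count = {a: 0 for a in arms}
    best_mean = max(u) * max(v)
    regret = 0.0

    for t in range(horizon):
        if t < len(arms):
            # Assumed initialization: pull every arm once.
            arm = arms[t]
        else:
            # Leader: arm with the highest empirical mean.
            leader = max(arms, key=lambda a: successes[a] / pulls[a])
            leader_count[leader] += 1
            if leader_count[leader] % gamma == 0:
                # Forced exploitation of the leader (assumed schedule).
                arm = leader
            else:
                # Thompson Sampling with Beta(1, 1) priors, restricted
                # to the leader's neighborhood.
                arm = max(
                    neighbors(leader, n_rows, n_cols),
                    key=lambda a: rng.betavariate(
                        successes[a] + 1, pulls[a] - successes[a] + 1
                    ),
                )
        i, j = arm
        reward = 1 if rng.random() < u[i] * v[j] else 0
        successes[arm] += reward
        pulls[arm] += 1
        regret += best_mean - u[i] * v[j]

    return pulls, regret


if __name__ == "__main__":
    pulls, regret = run_uts([0.8, 0.4, 0.2], [0.9, 0.5, 0.3, 0.1], horizon=2000)
    print(max(pulls, key=pulls.get), round(regret, 2))
```

Restricting the posterior sampling to one row and one column means each round compares n_rows + n_cols - 1 arms rather than all n_rows * n_cols, which is where exploiting the unimodal structure pays off in this sketch.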

Keywords

unimodal bandits; rank-one bandits; multi-armed bandits

Domains

Machine Learning [stat.ML]

Main file

HaL.pdf (1.14 MB)

8-8-all.jpg (134.82 KB)

Origin : Files produced by the author(s)


https://hal.science/hal-02396943

Submitted on : Friday, December 6, 2019-11:50:44 AM

Last modification on : Friday, May 31, 2024-6:32:03 PM

Long-term archiving on: Saturday, March 7, 2020-3:58:08 PM

Dates and versions

hal-02396943 , version 1 (06-12-2019)

hal-02396943 , version 2 (17-02-2020)

Identifiers

HAL Id : hal-02396943 , version 1
ARXIV : 1912.03074

Cite

Cindy Trinh, Emilie Kaufmann, Claire Vernade, Richard Combes. Solving Bernoulli Rank-One Bandits with Unimodal Thompson Sampling. 2019. ⟨hal-02396943v1⟩
