Linear Thompson Sampling Revisited

Marc Abeille; Alessandro Lazaric

Conference paper, Year: 2017

Marc Abeille

Role: Author
PersonId : 1004466

Sequential Learning

Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189

Alessandro Lazaric

Role: Author
PersonId : 851
IdHAL : alessandro-lazaric
ORCID : 0000-0002-8970-413X
IdRef : 188701486

Sequential Learning

Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189

Abstract

We derive an alternative proof for the regret of Thompson sampling (TS) in the stochastic linear bandit setting. While we obtain a regret bound of order $\widetilde{O}(d^{3/2}\sqrt{T})$, as in previous results, the proof sheds new light on how TS works. We leverage the structure of the problem to show how the regret is related to the sensitivity (i.e., the gradient) of the objective function and how selecting optimal arms associated with \textit{optimistic} parameters controls it. Thus we show that TS can be seen as a generic randomized algorithm where the sampling distribution is designed to have a fixed probability of being optimistic, at the cost of an additional $\sqrt{d}$ regret factor compared to a UCB-like approach. Furthermore, we show that our proof can be readily applied to regularized linear optimization and generalized linear model problems.
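As an illustration of the randomized scheme the abstract describes, here is a minimal sketch of linear Thompson sampling on a finite arm set. It is not taken from the paper: the function name `linear_thompson_sampling`, the Gaussian perturbation, the constant `scale` (standing in for a confidence-width schedule), and the parameters `reg` and `noise_std` are all assumptions made for this example.

```python
import numpy as np


def linear_thompson_sampling(arms, theta_star, horizon,
                             noise_std=0.1, reg=1.0, scale=1.0, seed=0):
    """Simulate linear TS and return the cumulative pseudo-regret per round.

    Hypothetical sketch: `scale` is a constant stand-in for a
    confidence-width schedule, and the perturbation is Gaussian.
    """
    rng = np.random.default_rng(seed)
    n_arms, d = arms.shape

    # Expected reward of each arm under the true (unknown) parameter,
    # used here only to simulate rewards and to measure regret.
    mean_rewards = arms @ theta_star
    gaps = mean_rewards.max() - mean_rewards

    V = reg * np.eye(d)   # regularized design matrix
    b = np.zeros(d)       # sum of reward-weighted arm vectors
    cumulative_regret = np.zeros(horizon)
    total = 0.0

    for t in range(horizon):
        # Regularized least-squares estimate of the parameter.
        theta_hat = np.linalg.solve(V, b)

        # Sample a perturbed parameter with covariance scale^2 * V^{-1}:
        # with V = L L^T, the vector L^{-T} eta has covariance V^{-1}.
        L = np.linalg.cholesky(V)
        eta = rng.standard_normal(d)
        theta_tilde = theta_hat + scale * np.linalg.solve(L.T, eta)

        # Play the arm that is optimal for the sampled parameter.
        idx = int(np.argmax(arms @ theta_tilde))
        x = arms[idx]
        reward = mean_rewards[idx] + noise_std * rng.standard_normal()

        # Update the sufficient statistics and the regret.
        V += np.outer(x, x)
        b += reward * x
        total += gaps[idx]
        cumulative_regret[t] = total

    return cumulative_regret
```

Calling it with a few arms in $\mathbb{R}^2$ and a hypothetical `theta_star` returns a non-decreasing regret curve. A sampled parameter is optimistic when its best arm value is at least that of the true parameter, so inflating `scale` raises the probability of optimism at the price of more exploration.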

Domains

Machine Learning [stat.ML]

Main file

main.pdf (782.06 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-01493561

Submitted on: Tuesday, March 21, 2017, 17:37:20

Last modified on: Wednesday, January 24, 2024, 09:54:23

Long-term archiving on: Thursday, June 22, 2017, 14:16:20

Dates and versions

hal-01493561 , version 1 (21-03-2017)

Identifiers

HAL Id : hal-01493561 , version 1

Cite

Marc Abeille, Alessandro Lazaric. Linear Thompson Sampling Revisited. AISTATS 2017 - 20th International Conference on Artificial Intelligence and Statistics, Apr 2017, Fort Lauderdale, United States. ⟨hal-01493561⟩

Collections

CNRS INRIA CRISTAL INRIA2 CRISTAL-SEQUEL UNIV-LILLE ANR
