Latent Bandits. - Inria - Institut national de recherche en sciences et technologies du numérique
Other publication. Year: 2014

Latent Bandits.

Abstract

We consider a multi-armed bandit problem where the reward distributions are indexed by two sets (one for arms, one for types) and can be partitioned into a small number of clusters according to the type. First, we consider the setting where all reward distributions are known and all types have the same underlying cluster; the type's identity, however, is unknown. Second, we study the case where types may come from different classes, which is significantly more challenging. Finally, we tackle the case where the reward distributions are completely unknown. In each setting, we introduce specific algorithms and derive non-trivial regret guarantees. Numerical experiments show that, in the most challenging agnostic case, the proposed algorithm achieves excellent performance in several difficult scenarios.
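To make the first setting more concrete, here is a minimal Python sketch. It is not the algorithm from the paper. It assumes Bernoulli rewards, a known table `means[c][a]` giving the mean reward of arm `a` under cluster `c`, and a single hidden cluster. The names `plausible_clusters` and `run_latent_bandit`, the Hoeffding-style confidence radius, and the optimistic selection rule are all illustrative assumptions.

```python
import math
import random


def plausible_clusters(means, counts, sums, t):
    """Return the clusters whose known means agree with every arm's
    empirical mean up to a Hoeffding-style radius (an assumed choice)."""
    keep = []
    for c, mu in enumerate(means):
        consistent = True
        for a in range(len(mu)):
            if counts[a] == 0:
                continue  # no data on this arm yet, so it cannot rule out c
            radius = math.sqrt(2.0 * math.log(t + 1) / counts[a])
            if abs(sums[a] / counts[a] - mu[a]) > radius:
                consistent = False
                break
        if consistent:
            keep.append(c)
    return keep


def run_latent_bandit(means, true_cluster, horizon, seed=0):
    """Simulate an optimistic strategy when the reward means are known
    but the active cluster is hidden. Returns (pseudo-regret, pull counts)."""
    rng = random.Random(seed)
    n_arms = len(means[0])
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    best = max(means[true_cluster])
    regret = 0.0
    for t in range(1, horizon + 1):
        # Fall back to all clusters if every one has been ruled out.
        candidates = plausible_clusters(means, counts, sums, t)
        if not candidates:
            candidates = list(range(len(means)))
        # Optimism: trust the plausible cluster with the highest best mean.
        c_opt = max(candidates, key=lambda c: max(means[c]))
        arm = max(range(n_arms), key=lambda a: means[c_opt][a])
        # Bernoulli reward drawn from the hidden true cluster.
        reward = 1.0 if rng.random() < means[true_cluster][arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[true_cluster][arm]
    return regret, counts


if __name__ == "__main__":
    regret, counts = run_latent_bandit(
        [[0.9, 0.1], [0.1, 0.9]], true_cluster=1, horizon=2000
    )
    print("pseudo-regret:", regret, "pulls per arm:", counts)
```

In this toy instance the two clusters prefer different arms, so a handful of pulls on the wrong arm is enough to rule out the wrong cluster. The harder settings described in the abstract (types drawn from different classes, unknown reward distributions) are not covered by this sketch.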
Main file
icml_cr_Arxiv.pdf (407.49 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00926281, version 1 (09-01-2014)

Identifiers

  • HAL Id: hal-00926281, version 1

Cite

Odalric-Ambrym Maillard, Shie Mannor. Latent Bandits. 2014. ⟨hal-00926281⟩
