Cheap Bandits

Manjesh Kumar Hanawal; Venkatesh Saligrama; Michal Valko; Rémi Munos

Conference paper, Year: 2015

Manjesh Kumar Hanawal

Role: Author

Department of Electrical and Computer Engineering [Boston University]

Venkatesh Saligrama

Role: Author
PersonId : 884871

Department of Electrical and Computer Engineering [Boston University]

Michal Valko

Role: Author
PersonId : 284
IdHAL : michal
IdRef : 22360934X

Sequential Learning

Rémi Munos

Role: Author
PersonId : 836863

Sequential Learning

Abstract

We consider stochastic sequential learning problems where the learner can observe the average reward of several actions. Such a setting is interesting in many applications involving monitoring and surveillance, where the set of actions to observe represents some (geographical) area. The importance of this setting is that in these applications, it is actually cheaper to observe the average reward of a group of actions than the reward of a single action. We show that when the reward is smooth over a given graph representing the neighboring actions, we can maximize the cumulative reward of learning while minimizing the sensing cost. In this paper we propose CheapUCB, an algorithm that matches the regret guarantees of the known algorithms for this setting and at the same time guarantees a linear cost gain over them. As a by-product of our analysis, we establish an Ω(√(dT)) lower bound on the cumulative regret of spectral bandits for a class of graphs with effective dimension d.
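The observation model described in the abstract can be illustrated with a minimal sketch. It is not CheapUCB and not the authors' code: the path graph, the reward values, and the helper names `edge_smoothness` and `probe_average` are all assumptions made for illustration. It only shows what "reward smooth over a graph" and "observing the average reward of a group of actions" might look like in code.

```python
import random


def edge_smoothness(values):
    """Sum of squared differences across the edges of a path graph.

    A small value means neighboring actions have similar mean rewards.
    (Assumption: a path graph stands in for the paper's general graph.)
    """
    return sum((a - b) ** 2 for a, b in zip(values, values[1:]))


def probe_average(mean_rewards, group, noise_std, rng):
    """Noisy observation of the average mean reward over a group of actions."""
    avg = sum(mean_rewards[i] for i in group) / len(group)
    return avg + rng.gauss(0.0, noise_std)


rng = random.Random(0)

# Hypothetical smooth reward profile over 8 neighboring actions.
mean_rewards = [0.10, 0.12, 0.15, 0.20, 0.26, 0.30, 0.32, 0.33]

smoothness = edge_smoothness(mean_rewards)

# Noise-free probes, so that the two kinds of observation can be compared directly.
single = probe_average(mean_rewards, [5], 0.0, rng)       # one action
group = probe_average(mean_rewards, [4, 5, 6], 0.0, rng)  # action plus neighbors
```

Because the hypothetical profile is smooth, the group probe stays close to the single-action value, which gives some intuition for why a cheaper averaged observation can still be informative in this setting.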

Domains

Machine Learning [stat.ML]; Information Retrieval [cs.IR]

Main file

hanawal2015cheap.pdf (2.05 MB)

Origin: Files produced by the author(s)

Contributor: Michal Valko

https://inria.hal.science/hal-01153540

Submitted on: Tuesday, May 19, 2015, 23:33:08

Last modified on: Wednesday, January 24, 2024, 09:54:23

Long-term archiving on: Thursday, April 20, 2017, 04:23:57

Dates and versions

hal-01153540, version 1 (19-05-2015)

Identifiers

HAL Id: hal-01153540, version 1

Cite

Manjesh Kumar Hanawal, Venkatesh Saligrama, Michal Valko, Rémi Munos. Cheap Bandits. International Conference on Machine Learning, 2015, Lille, France. ⟨hal-01153540⟩

Collections

CNRS INRIA CRISTAL INRIA2 CRISTAL-SEQUEL UNIV-LILLE ANR

1602 views

114 downloads