Risk-Aversion in Multi-armed Bandits

Amir Sani; Alessandro Lazaric; Rémi Munos

Report (Research Report) Year: 2012

Amir Sani

Role: Author
PersonId : 8209
IdHAL : amirsani
IdRef : 188701648

Sequential Learning

Alessandro Lazaric

Role: Author
PersonId : 851
IdHAL : alessandro-lazaric
ORCID : 0000-0002-8970-413X
IdRef : 188701486

Sequential Learning

Rémi Munos

Role: Author
PersonId : 836863

Sequential Learning

Abstract

Stochastic multi-armed bandits solve the exploration-exploitation dilemma and ultimately maximize the expected reward. Nonetheless, in many practical problems, maximizing the expected reward is not the most desirable objective. In this paper, we introduce a novel setting based on the principle of risk-aversion where the objective is to compete against the arm with the best risk-return trade-off. This setting proves to be intrinsically more difficult than the standard multi-armed bandit setting due in part to an exploration risk which introduces a regret associated with the variability of an algorithm. Using variance as a measure of risk, we introduce two new algorithms, investigate their theoretical guarantees, and report preliminary empirical results.
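To make the idea of a risk-return trade-off concrete, here is a minimal sketch of scoring arms by variance and mean together. The abstract does not state the exact objective, so the score used here (empirical variance minus `rho` times empirical mean, lower is better), the risk-tolerance parameter `rho`, the helper names, and the two Gaussian arms are all assumptions made for illustration. This is not one of the two algorithms the report introduces.

```python
import random
import statistics

def mean_variance(rewards, rho):
    """Assumed score: empirical variance - rho * empirical mean (lower is better)."""
    mean = statistics.fmean(rewards)
    variance = statistics.pvariance(rewards, mu=mean)
    return variance - rho * mean

def best_arm(samples_per_arm, rho):
    """Index of the arm with the lowest empirical mean-variance score."""
    scores = [mean_variance(rewards, rho) for rewards in samples_per_arm]
    return min(range(len(scores)), key=scores.__getitem__)

# Two hypothetical arms: arm 0 has the higher mean but is much noisier.
rng = random.Random(0)
arm_samples = [
    [rng.gauss(1.0, 2.0) for _ in range(5000)],
    [rng.gauss(0.8, 0.1) for _ in range(5000)],
]

# A small rho weights variance heavily; a large rho weights the mean heavily.
risk_averse_choice = best_arm(arm_samples, rho=1.0)
return_seeking_choice = best_arm(arm_samples, rho=100.0)
```

Under these assumptions, a small `rho` favors the low-variance arm even though its mean is lower, while a large `rho` recovers the usual preference for the arm with the highest expected reward.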

Domains

Machine Learning [cs.LG]

Main file

risk-bandit.pdf (631.98 KB)

Origin: Files produced by the author(s)

Contributor: Alessandro Lazaric

https://inria.hal.science/hal-00750298

Submitted on: Wednesday, January 9, 2013, 19:01:48

Last modified on: Friday, March 24, 2023, 14:52:56

Long-term archiving on: Friday, March 31, 2017, 15:52:58

Dates and versions

hal-00750298 , version 1 (09-01-2013)

Identifiers

HAL Id: hal-00750298, version 1
arXiv: 1301.1936

Cite

Amir Sani, Alessandro Lazaric, Rémi Munos. Risk-Aversion in Multi-armed Bandits. [Research Report] 2012. ⟨hal-00750298⟩

Collections

UNIV-LILLE3 CNRS INRIA LAGIS INRIA2 LARA

241 Views

260 Downloads
