Risk-Aversion in Multi-armed Bandits

Amir Sani; Alessandro Lazaric; Rémi Munos

Conference paper. Year: 2012


Amir Sani

Role: Author
PersonId: 8209
IdHAL: amirsani
IdRef: 188701648

Sequential Learning

Alessandro Lazaric

Role: Author
PersonId: 851
IdHAL: alessandro-lazaric
ORCID: 0000-0002-8970-413X
IdRef: 188701486

Sequential Learning

Rémi Munos

Role: Author
PersonId: 836863

Sequential Learning

Abstract

Stochastic multi-armed bandits solve the exploration-exploitation dilemma and ultimately maximize the expected reward. Nonetheless, in many practical problems, maximizing the expected reward is not the most desirable objective. In this paper, we introduce a novel setting based on the principle of risk-aversion, where the objective is to compete against the arm with the best risk-return trade-off. This setting proves to be more difficult than the standard multi-armed bandit setting, due in part to an exploration risk that introduces a regret associated with the variability of an algorithm. Using variance as a measure of risk, we define two algorithms, investigate their theoretical guarantees, and report preliminary empirical results.
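The abstract names variance as the risk measure but does not give a formula, so the following is only a minimal sketch under stated assumptions. It assumes a mean-variance score of the form "variance minus rho times mean" (lower is better), where `rho` is a hypothetical risk-tolerance parameter, and it adds a generic confidence bonus whose constant `bonus_scale` is purely illustrative. It is not presented as either of the two algorithms the paper defines.

```python
import math
import random


def empirical_mean_variance(rewards, rho):
    """Empirical variance minus rho times empirical mean (lower is better).

    The score form and the parameter `rho` are assumptions of this sketch.
    """
    n = len(rewards)
    mean = sum(rewards) / n
    variance = sum((r - mean) ** 2 for r in rewards) / n
    return variance - rho * mean


def select_arm(history, rho, t, bonus_scale=1.0):
    """Pick the arm with the smallest lower-bounded mean-variance score."""
    # Pull every arm once before trusting any estimate.
    for arm, rewards in enumerate(history):
        if not rewards:
            return arm

    def lower_bound(rewards):
        # Generic exploration bonus; its constant is an illustrative assumption.
        bonus = bonus_scale * math.sqrt(math.log(t + 1) / (2 * len(rewards)))
        return empirical_mean_variance(rewards, rho) - bonus

    return min(range(len(history)), key=lambda arm: lower_bound(history[arm]))


def run(arms, rho, horizon, seed=0):
    """Simulate the rule on Gaussian arms given as (mean, std) pairs.

    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    history = [[] for _ in arms]
    for t in range(1, horizon + 1):
        arm = select_arm(history, rho, t)
        mean, std = arms[arm]
        history[arm].append(rng.gauss(mean, std))
    return [len(rewards) for rewards in history]
```

For instance, `run([(1.0, 0.1), (1.2, 2.0)], rho=1.0, horizon=500)` pits a low-variance arm against one with a slightly higher mean but much larger spread. Under the assumed score the first arm has the better risk-return trade-off, whereas a pure expected-reward objective would prefer the second.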

Domains

Machine Learning [stat.ML]

Main file

risk-bandit-cr.pdf (283.1 KB)

Origin: Files produced by the author(s)

Contributor: Alessandro Lazaric

https://inria.hal.science/hal-00772609

Submitted on: Thursday, January 10, 2013, 18:09:37

Last modified on: Thursday, February 15, 2024, 03:31:38

Long-term archiving on: Saturday, April 1, 2017, 03:40:10

Dates and versions

hal-00772609, version 1 (January 10, 2013)

Identifiers

HAL Id: hal-00772609, version 1

Cite

Amir Sani, Alessandro Lazaric, Rémi Munos. Risk-Aversion in Multi-armed Bandits. NIPS - Twenty-Sixth Annual Conference on Neural Information Processing Systems, Dec 2012, Lake Tahoe, United States. ⟨hal-00772609⟩


Collections

UNIV-RENNES1 UNIV-LILLE3 CNRS INRIA IRISA LAGIS INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UR1-MATH-NUM

335 views

162 downloads
