Robust Risk-averse Stochastic Multi-Armed Bandits

Odalric-Ambrym Maillard

Other publication, Year: 2013

Odalric-Ambrym Maillard

Role: Author
PersonId: 5563
IdHAL: odalric-ambrym-maillard
ORCID: 0000-0001-7935-7026
IdRef: 158055594

Department of Electrical Engineering - Technion [Haïfa]

Abstract

We study a variant of the standard stochastic multi-armed bandit problem in which one is interested not in the arm with the best mean, but in the arm maximizing some coherent risk measure criterion. Further, we study the deviations of the regret rather than the less informative expected regret. We provide an algorithm, called RA-UCB, to solve this problem, together with a high-probability bound on its regret.
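To make the change of criterion concrete, here is a minimal sketch of how ranking arms by a risk measure can differ from ranking them by the mean. It is not the RA-UCB algorithm, and it does not reproduce the paper's analysis. It assumes conditional value-at-risk (CVaR) at level alpha as the coherent risk measure, which the abstract does not specify; the arm names, the reward samples, and the helpers `empirical_mean` and `empirical_cvar` are hypothetical and chosen only for illustration.

```python
import math


def empirical_mean(rewards):
    """Plain sample average of the observed rewards."""
    return sum(rewards) / len(rewards)


def empirical_cvar(rewards, alpha):
    """Average of the worst ceil(alpha * n) rewards.

    This is a simple lower-tail CVaR estimate (an assumed choice of
    coherent risk measure, not necessarily the one used in the paper).
    Larger values mean a less severe worst-case tail.
    """
    ordered = sorted(rewards)
    k = max(1, math.ceil(alpha * len(ordered)))
    tail = ordered[:k]
    return sum(tail) / len(tail)


# Hypothetical reward samples for two arms (illustrative data only).
arms = {
    "safe": [0.25] * 8,                # constant reward, no downside
    "risky": [1.0] * 6 + [-1.0] * 2,   # higher mean, occasional large loss
}

alpha = 0.25  # assumed tail level: look at the worst quarter of outcomes

# Standard criterion: pick the arm with the largest empirical mean.
best_by_mean = max(arms, key=lambda name: empirical_mean(arms[name]))

# Risk-averse criterion: pick the arm with the largest empirical CVaR.
best_by_cvar = max(arms, key=lambda name: empirical_cvar(arms[name], alpha))
```

On this toy data the mean criterion selects the "risky" arm while the CVaR criterion selects the "safe" arm, so the two notions of "best arm" (and hence of regret) need not coincide.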

Domains

Machine Learning [cs.LG]

Main file

RiskAwareKLMAB_Arxiv.pdf (256.54 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-00821670

Submitted on: Saturday, 11 May 2013, 16:25:06

Last modified on: Thursday, 6 January 2022, 14:50:02

Long-term archiving on: Monday, 19 August 2013, 15:45:24

Dates and versions

hal-00821670, version 1 (11-05-2013)

Identifiers

HAL Id: hal-00821670, version 1

Cite

Odalric-Ambrym Maillard. Robust Risk-averse Stochastic Multi-Armed Bandits. 2013. ⟨hal-00821670⟩
