Robust Risk-averse Stochastic Multi-Armed Bandits

Abstract : We study a variant of the standard stochastic multi-armed bandit problem in which one is interested not in the arm with the best mean, but in the arm maximizing some coherent risk measure criterion. Further, we study the deviations of the regret rather than the less informative expected regret. We provide an algorithm, called RA-UCB, to solve this problem, together with a high-probability bound on its regret.
Document type : Other publications

Cited literature : 32 references

https://hal.inria.fr/hal-00821670
Contributor : Odalric-Ambrym Maillard
Submitted on : Saturday, May 11, 2013 - 4:25:06 PM
Last modification on : Sunday, December 31, 2017 - 9:44:02 AM
Long-term archiving on : Monday, August 19, 2013 - 3:45:24 PM

File

RiskAwareKLMAB_Arxiv.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00821670, version 1

Citation

Odalric-Ambrym Maillard. Robust Risk-averse Stochastic Multi-Armed Bandits. 2013. ⟨hal-00821670⟩
