Robust Risk-averse Stochastic Multi-Armed Bandits

Abstract : We study a variant of the standard stochastic multi-armed bandit problem in which one is interested not in the arm with the best mean, but in the arm maximizing some coherent risk measure criterion. Further, we study the deviations of the regret rather than the less informative expected regret. We provide an algorithm, called RA-UCB, to solve this problem, together with a high-probability bound on its regret.
Document type : Other publications

Cited literature : 32 references
Contributor : Odalric-Ambrym Maillard
Submitted on : Saturday, May 11, 2013 - 4:25:06 PM
Last modification on : Thursday, January 6, 2022 - 2:50:02 PM
Long-term archiving on : Monday, August 19, 2013 - 3:45:24 PM


Files produced by the author(s)


  • HAL Id : hal-00821670, version 1


Odalric-Ambrym Maillard. Robust Risk-averse Stochastic Multi-Armed Bandits. 2013. ⟨hal-00821670⟩


