Entropic Value-at-Risk: A New Coherent Risk Measure, Journal of Optimization Theory and Applications, vol.17, issue.3, pp.1105-1123, 2012. ,
DOI : 10.1007/s10957-011-9968-2
Coherent multiperiod risk adjusted values and Bellman???s principle, Annals of Operations Research, vol.10, issue.1, pp.5-22, 2007. ,
DOI : 10.1007/s10479-006-0132-6
Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol.47, issue.2/3, pp.235-256, 2002. ,
DOI : 10.1023/A:1013689704352
The Nonstochastic Multiarmed Bandit Problem, SIAM Journal on Computing, vol.32, issue.1, pp.48-77, 2003. ,
DOI : 10.1137/S0097539701398375
Duality Relationships for Entropy-Like Minimization Problems, SIAM Journal on Control and Optimization, vol.29, issue.2, pp.325-338, 1991. ,
DOI : 10.1137/0329017
Optimal Adaptive Policies for Sequential Allocation Problems, Advances in Applied Mathematics, vol.17, issue.2, pp.122-142, 1996. ,
DOI : 10.1006/aama.1996.0007
Kullback-leibler upper confidence bounds for optimal sequential allocation. The Annals of Statistics, 2013. ,
Elements of Information Theory, 1991. ,
Risk-aware decision making and dynamic programming, NIPS Workshop on Model Uncertainty and Risk in RL, 2008. ,
Large Deviations Techniques and Applications, 1998. ,
DOI : 10.1007/978-1-4612-5320-4
Optimal stopping, exponential utility, and linear programming, Mathematical Programming, pp.228-244, 1979. ,
DOI : 10.1007/BF01582110
Risk-Sensitive Online Learning, Proceedings of the 17th international conference on Algorithmic Learning Theory, 2006. ,
DOI : 10.1007/11894841_18
The KL-UCB algorithm for bounded stochastic bandits and beyond, Proceedings of the 24th Annual Conference on Learning Theory, 2011. ,
Vraisemblance empirique généralisée et estimation semiparamétrique, 2006. ,
An asymptotically optimal bandit algorithm for bounded support models, Proceedings of the 23rd Annual Conference on Learning Theory, 2010. ,
An asymptotically optimal policy for finite support models in the multiarmed bandit problem, Machine Learning, vol.28, issue.3, pp.361-391, 2011. ,
DOI : 10.1007/s10994-011-5257-4
Finite-time regret bound of a bandit algorithm for the semi-bounded support model, 2012. ,
Risk-sensitive markov decision processes, Management Science, vol.18, pp.356-369, 1972. ,
Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis, Proceedings of the Algorithmic Learning Theory conference, pp.199-213, 2012. ,
DOI : 10.1007/978-3-642-34106-9_18
URL : https://hal.archives-ouvertes.fr/hal-00830033
Asymptotically efficient adaptive allocation rules, Advances in Applied Mathematics, vol.6, issue.1, pp.4-22, 1985. ,
DOI : 10.1016/0196-8858(85)90002-8
An exact algorithm for solving mdps under risksensitive planning objectives with one-switch utility functions, pp.453-460, 2008. ,
A finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences, Proceedings of the 23rd Annual Conference on Learning Theory, 2011. ,
URL : https://hal.archives-ouvertes.fr/inria-00574987
PORTFOLIO SELECTION*, The Journal of Finance, vol.7, issue.1, pp.77-91, 1952. ,
DOI : 10.1111/j.1540-6261.1952.tb01525.x
Theory of games and economic behavior, 1947. ,
On terminating markov decision processes with a risk-averse objective function, Automatica, vol.37, issue.9, pp.1379-1386, 2001. ,
Some aspects of the sequential design of experiments, Bulletin of the American Mathematical Society, vol.58, issue.5, pp.527-535, 1952. ,
DOI : 10.1090/S0002-9904-1952-09620-8
Coherent Approaches to Risk in Optimization Under Uncertainty, Tutorials in operation Research, pp.38-61, 2007. ,
DOI : 10.1287/educ.1073.0032
Robustness of stochastic bandit policies, Theoretical Computer Science, vol.519, 2012. ,
DOI : 10.1016/j.tcs.2013.09.019
Risk-aversion in multi-armed bandits, Proceedings of Advancezs in neural information processing system, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00772609
On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples, Biometrika, vol.25, issue.3/4, pp.285-294, 1933. ,
DOI : 10.2307/2332286
On the Theory of Apportionment, American Journal of Mathematics, vol.57, issue.2, pp.450-456, 1935. ,
DOI : 10.2307/2371219
Online variance minimization, Proceedings of the 19th Annual Conference on Learning Theory, pp.514-528, 2006. ,