Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol.47, issue.2/3, pp.235-256, 2002. ,
DOI : 10.1023/A:1013689704352
Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, Foundations and Trends in Machine Learning, vol.5, issue.1, pp.1-122, 2012. ,
DOI : 10.1561/2200000024
Pure Exploration in Multi-armed Bandits Problems, Proceedings of the 20th ALT, ALT'09, pp.23-37, 2009. ,
DOI : 10.1007/978-3-642-04414-4_7
Multiple identifications in multi-armed bandits, Proceedings of The 30th ICML, pp.258-265, 2013. ,
A Survey of Preference-Based Online Learning with Bandit Algorithms, Algorithmic Learning Theory (ALT), pp.18-39, 2014. ,
DOI : 10.1007/978-3-319-11662-4_3
Fast boosting using adversarial bandits, International Conference on Machine Learning (ICML), pp.143-150, 2010. ,
URL : https://hal.archives-ouvertes.fr/in2p3-00614564
Kullback–Leibler upper confidence bounds for optimal sequential allocation, The Annals of Statistics, vol.41, issue.3, pp.1516-1541, 2013. ,
DOI : 10.1214/13-AOS1119SUPP
Extreme bandits, Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems, pp.1089-1097, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-01079354
Prediction, Learning and Games, 2006. ,
DOI : 10.1017/CBO9780511546921
Asymptotic Minimax Character of the Sample Distribution Function and of the Classical Multinomial Estimator, The Annals of Mathematical Statistics, vol.27, issue.3, pp.642-669, 1956. ,
DOI : 10.1214/aoms/1177728174
PAC Bounds for Multi-armed Bandit and Markov Decision Processes, Proceedings of the 15th Conference on Learning Theory (COLT), pp.255-270, 2002. ,
DOI : 10.1007/3-540-45435-7_18
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.126.1345
Multi-bandit best arm identification, Advances in NIPS 24, pp.2222-2230, 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00632523
Feature selection as a one-player game, Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp.359-366, 2010. ,
URL : https://hal.archives-ouvertes.fr/inria-00484049
PAC subset selection in stochastic multi-armed bandits, Proceedings of the Twenty-ninth International Conference on Machine Learning, pp.655-662, 2012. ,
Algorithms for multi-armed bandit problems, CoRR, abs/1402, 2014. ,
Asymptotically efficient adaptive allocation rules, Advances in Applied Mathematics, vol.6, issue.1, pp.4-22, 1985. ,
DOI : 10.1016/0196-8858(85)90002-8
The sample complexity of exploration in the multi-armed bandit problem, Journal of Machine Learning Research, vol.5, pp.623-648, 2004. ,
The tight constant in the Dvoretzky-Kiefer-Wolfowitz inequality, The Annals of Probability, pp.1269-1283, 1990. ,
Risk-aversion in multi-armed bandits, 26th Annual Conference on Neural Information Processing Systems (NIPS), pp.3284-3292, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00772609
An irreverent guide to Value-at-Risk, Financial Engineering News, vol.1, 1997. ,
Sample complexity of risk-averse bandit-arm selection, Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence, pp.2576-2582, 2013. ,
The K-armed dueling bandits problem, Journal of Computer and System Sciences, vol.78, issue.5, pp.1538-1556, 2012. ,
DOI : 10.1016/j.jcss.2011.12.028
Beat the mean bandit, Proceedings of the International Conference on Machine Learning (ICML), pp.241-248, 2011. ,
Optimal PAC multiple arm identification with applications to crowdsourcing, Proceedings of the 31st International Conference on Machine Learning (ICML-14), pp.217-225, 2014. ,
Relative upper confidence bound for the K-armed dueling bandit problem, Proceedings of the Thirty-First International Conference on Machine Learning (ICML), pp.10-18, 2014. ,