P. Auer, N. Cesa-Bianchi, and P. Fischer, Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol.47, issue.2/3, pp.235-256, 2002.
DOI : 10.1023/A:1013689704352

S. Bubeck and N. Cesa-Bianchi, Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, Foundations and Trends in Machine Learning, vol.5, issue.1, pp.1-122, 2012.
DOI : 10.1561/2200000024

S. Bubeck, R. Munos, and G. Stoltz, Pure Exploration in Multi-armed Bandits Problems, Proceedings of the 20th ALT, ALT'09, pp.23-37, 2009.

S. Bubeck, T. Wang, and N. Viswanathan, Multiple identifications in multi-armed bandits, Proceedings of The 30th ICML, pp.258-265, 2013.

R. Busa-Fekete and E. Hüllermeier, A Survey of Preference-Based Online Learning with Bandit Algorithms, Algorithmic Learning Theory (ALT), pp.18-39.
DOI : 10.1007/978-3-319-11662-4_3

R. Busa-Fekete and B. Kégl, Fast boosting using adversarial bandits, International Conference on Machine Learning (ICML), pp.143-150, 2010.
URL : https://hal.archives-ouvertes.fr/in2p3-00614564

O. Cappé, A. Garivier, O. Maillard, R. Munos, and G. Stoltz, Kullback–Leibler upper confidence bounds for optimal sequential allocation, The Annals of Statistics, vol.41, issue.3, pp.1516-1541, 2013.
DOI : 10.1214/13-AOS1119

A. Carpentier and M. Valko, Extreme bandits, Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems, pp.1089-1097, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01079354

N. Cesa-Bianchi and G. Lugosi, Prediction, Learning and Games, 2006.
DOI : 10.1017/CBO9780511546921

A. Dvoretzky, J. Kiefer, and J. Wolfowitz, Asymptotic Minimax Character of the Sample Distribution Function and of the Classical Multinomial Estimator, The Annals of Mathematical Statistics, vol.27, issue.3, pp.642-669, 1956.
DOI : 10.1214/aoms/1177728174

E. Even-Dar, S. Mannor, and Y. Mansour, PAC Bounds for Multi-armed Bandit and Markov Decision Processes, Proceedings of the 15th Conference on Learning Theory (COLT), pp.255-270, 2002.
DOI : 10.1007/3-540-45435-7_18
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.126.1345

V. Gabillon, M. Ghavamzadeh, A. Lazaric, and S. Bubeck, Multi-bandit best arm identification, Advances in NIPS 24, pp.2222-2230, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00632523

R. Gaudel and M. Sebag, Feature selection as a one-player game, Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp.359-366, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00484049

S. Kalyanakrishnan, A. Tewari, P. Auer, and P. Stone, PAC subset selection in stochastic multi-armed bandits, Proceedings of the Twenty-ninth International Conference on Machine Learning, pp.655-662, 2012.

V. Kuleshov and D. Precup, Algorithms for multi-armed bandit problems, CoRR, abs/1402, 2014.

T. L. Lai and H. Robbins, Asymptotically efficient adaptive allocation rules, Advances in Applied Mathematics, vol.6, issue.1, pp.4-22, 1985.
DOI : 10.1016/0196-8858(85)90002-8

S. Mannor and J. N. Tsitsiklis, The sample complexity of exploration in the multi-armed bandit problem, Journal of Machine Learning Research, vol.5, pp.623-648, 2004.

P. Massart, The tight constant in the Dvoretzky-Kiefer-Wolfowitz inequality, The Annals of Probability, pp.1269-1283, 1990.

A. Sani, A. Lazaric, and R. Munos, Risk-aversion in multi-armed bandits, 26th Annual Conference on Neural Information Processing Systems (NIPS), pp.3284-3292, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00772609

B. Schachter, An irreverent guide to Value-at-Risk, Financial Engineering News, vol.1, 1997.

J. Y. Yu and E. Nikolova, Sample complexity of risk-averse bandit-arm selection, Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence, pp.2576-2582, 2013.

Y. Yue, J. Broder, R. Kleinberg, and T. Joachims, The K-armed dueling bandits problem, Journal of Computer and System Sciences, vol.78, issue.5, pp.1538-1556, 2012.
DOI : 10.1016/j.jcss.2011.12.028

Y. Yue and T. Joachims, Beat the mean bandit, Proceedings of the International Conference on Machine Learning (ICML), pp.241-248, 2011.

Y. Zhou, X. Chen, and J. Li, Optimal PAC multiple arm identification with applications to crowdsourcing, Proceedings of the 31st International Conference on Machine Learning (ICML-14), pp.217-225, 2014.

M. Zoghi, S. Whiteson, R. Munos, and M. de Rijke, Relative upper confidence bound for the K-armed dueling bandit problem, Proceedings of the Thirty-First International Conference on Machine Learning (ICML), pp.10-18, 2014.