P. Auer et al., Finite-time analysis of the multiarmed bandit problem, Machine Learning, pp.235-256, 2002.

M. Braverman and E. Mossel, Noisy sorting without resampling, Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pp.268-276, 2008.

M. Braverman and E. Mossel, Sorting from noisy information, 2009.

S. Bubeck et al., Pure Exploration in Multi-armed Bandits Problems, Proceedings of the 20th International Conference on Algorithmic Learning Theory, ALT'09, pp.23-37, 2009.
DOI : 10.1090/S0002-9904-1952-09620-8

S. Bubeck et al., Multiple identifications in multi-armed bandits, Proceedings of the 30th International Conference on Machine Learning, pp.258-265, 2013.

R. Busa-Fekete et al., Top-k selection based on adaptive sampling of noisy preferences, Proceedings of the 30th International Conference on Machine Learning, 2013.
URL : https://hal.archives-ouvertes.fr/hal-01216055

O. Cappé et al., Kullback–Leibler upper confidence bounds for optimal sequential allocation, The Annals of Statistics, vol.41, issue.3, 2012.
DOI : 10.1214/13-AOS1119SUPP

D. Chafaï and D. Concordet, Confidence Regions for the Multinomial Parameter With Small Sample Size, Journal of the American Statistical Association, vol.104, issue.487, pp.1071-1079, 2009.
DOI : 10.1198/jasa.2009.tm08152

B. Eriksson, Learning to Top-K search using pairwise comparisons, Journal of Machine Learning Research - Proceedings Track, vol.31, pp.265-273, 2013.

E. Even-Dar et al., PAC Bounds for Multi-armed Bandit and Markov Decision Processes, Proceedings of the 15th Annual Conference on Computational Learning Theory, pp.255-270, 2002.
DOI : 10.1007/3-540-45435-7_18
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.126.1345

U. Feige et al., Computing with Noisy Information, SIAM Journal on Computing, vol.23, issue.5, pp.1001-1018, 1994.
DOI : 10.1137/S0097539791195877

V. Gabillon et al., Multi-bandit best arm identification, Advances in Neural Information Processing Systems 24, pp.2222-2230, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00632523

S. Guo et al., Score-Based Bayesian Skill Learning, European Conference on Machine Learning, pp.1-16, 2012.
DOI : 10.1007/978-3-642-33460-3_12
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.680.4832

W. Hoeffding, Probability Inequalities for Sums of Bounded Random Variables, Journal of the American Statistical Association, vol.58, issue.301, pp.13-30, 1963.
DOI : 10.1080/01621459.1963.10500830

S. Kalyanakrishnan, Learning Methods for Sequential Decision Making with Imperfect Representations, 2011.

S. Kalyanakrishnan et al., PAC subset selection in stochastic multi-armed bandits, Proceedings of the Twenty-ninth International Conference on Machine Learning, pp.655-662, 2012.

M. Kendall, Rank Correlation Methods, Biometrika, vol.44, issue.1/2, 1955.
DOI : 10.2307/2333282

H. Moulin, Axioms of cooperative decision making, 1988.
DOI : 10.1017/CCOL0521360552

T. Urvoy et al., Generic exploration and K-armed voting bandits, Proceedings of the 30th International Conference on Machine Learning, JMLR W&CP, pp.91-99, 2013.

Y. Yue et al., The K-armed dueling bandits problem, Journal of Computer and System Sciences, vol.78, issue.5, pp.1538-1556, 2012.
DOI : 10.1016/j.jcss.2011.12.028