N. Ailon, Z. S. Karnin, and T. Joachims, Reducing dueling bandits to cardinal bandits, ICML 2014, JMLR Proceedings, pp.856-864, 2014.

P. Auer, N. Cesa-Bianchi, and P. Fischer, Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol.47, issue.2/3, pp.235-256, 2002.
DOI : 10.1023/A:1013689704352

P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire, The Nonstochastic Multiarmed Bandit Problem, SIAM Journal on Computing, vol.32, issue.1, pp.48-77, 2002.
DOI : 10.1137/S0097539701398375

G. Bartók, A near-optimal algorithm for finite partial-monitoring games against adversarial opponents, COLT 2013, 2013.

G. Bartók, D. P. Foster, D. Pál, A. Rakhlin, and C. Szepesvári, Partial Monitoring – Classification, Regret Bounds, and Algorithms, Mathematics of Operations Research, vol.39, issue.4, pp.967-997, 2014.
DOI : 10.1287/moor.2014.0663

S. Bubeck and N. Cesa-Bianchi, Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, Foundations and Trends in Machine Learning, vol.5, issue.1, pp.1-122, 2012.
DOI : 10.1561/2200000024

R. Busa-Fekete and E. Hüllermeier, A Survey of Preference-Based Online Learning with Bandit Algorithms, ALT 2014, pp.18-39, 2014.
DOI : 10.1007/978-3-319-11662-4_3

R. Busa-Fekete, B. Szörényi, W. Cheng, P. Weng, and E. Hüllermeier, Top-k selection based on adaptive sampling of noisy preferences, ICML 2013, JMLR Proceedings, pp.1094-1102, 2013.
URL : https://hal.archives-ouvertes.fr/hal-01216055

N. Cesa-Bianchi and G. Lugosi, Combinatorial bandits, COLT 2009, pp.237-246, 2009.
DOI : 10.1016/j.jcss.2012.01.001

O. Chapelle, T. Joachims, F. Radlinski, and Y. Yue, Large-scale validation and analysis of interleaved search evaluation, ACM Transactions on Information Systems, vol.30, issue.1, p.6, 2012.
DOI : 10.1145/2094072.2094078

I. Charon and O. Hudry, An updated survey on the linear ordering problem for weighted or unweighted tournaments, Annals of Operations Research, vol.175, pp.107-158, 2010.
DOI : 10.1007/s10479-009-0648-7

Y. Freund, R. Iyer, R. E. Schapire, and Y. Singer, An efficient boosting algorithm for combining preferences, Journal of Machine Learning Research, vol.4, pp.933-969, 2003.

Y. Freund and R. E. Schapire, Adaptive Game Playing Using Multiplicative Weights, Games and Economic Behavior, vol.29, issue.1-2, pp.79-103, 1999.
DOI : 10.1006/game.1999.0738

J. Fürnkranz and E. Hüllermeier, Preference Learning, Springer, 2010.
DOI : 10.1007/978-1-4899-7502-7_667-1

T. Joachims, L. Granka, B. Pan, H. Hembrooke, F. Radlinski, and G. Gay, Evaluating the accuracy of implicit feedback from clicks and query reformulations in Web search, ACM Transactions on Information Systems, vol.25, issue.2, 2007.
DOI : 10.1145/1229179.1229181

R. M. Karp and R. Kleinberg, Noisy binary search and its applications, SODA 2007, SIAM Proceedings, pp.881-890, 2007.

T. Liu, Learning to Rank for Information Retrieval, Foundations and Trends in Information Retrieval, vol.3, issue.3, pp.225-331, 2009.
DOI : 10.1561/1500000016

T. Liu, J. Xu, T. Qin, W. Xiong, and H. Li, LETOR: Benchmark dataset for research on learning to rank for information retrieval, SIGIR 2007 Workshop on Learning to Rank for Information Retrieval, 2007.

A. Piccolboni and C. Schindelhauer, Discrete prediction games with arbitrary feedback and loss, COLT/EuroCOLT, pp.208-223, 2001.

F. Radlinski and T. Joachims, Active exploration for learning rankings from clickthrough data, Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining , KDD '07, pp.570-579, 2007.
DOI : 10.1145/1281192.1281254

Y. Seldin, C. Szepesvári, P. Auer, and Y. Abbasi-Yadkori, Evaluation and analysis of the performance of the EXP3 algorithm in stochastic environments, EWRL, volume 24 of JMLR Proceedings, pp.103-116, 2012.

T. Urvoy, F. Clerot, R. Féraud, and S. Naamane, Generic exploration and K-armed voting bandits, ICML 2013, JMLR Proceedings, pp.91-99, 2013.

Y. Yue and T. Joachims, Interactively optimizing information retrieval systems as a dueling bandits problem, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, pp.1201-1208, 2009.
DOI : 10.1145/1553374.1553527

Y. Yue and T. Joachims, Beat the mean bandit, ICML 2011, pp.241-248, 2011.

M. Zoghi, S. Whiteson, R. Munos, and M. de Rijke, Relative upper confidence bound for the K-armed dueling bandit problem, ICML 2014, JMLR Proceedings, pp.10-18, 2014.

M. Zoghi, S. A. Whiteson, M. de Rijke, and R. Munos, Relative confidence sampling for efficient online ranker evaluation, WSDM 2014, pp.73-82, 2014.