Reducing dueling bandits to cardinal bandits, ICML 2014 JMLR Proceedings, pp.856-864, 2014.
Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol.47, issue.2/3, pp.235-256, 2002.
DOI : 10.1023/A:1013689704352
The Nonstochastic Multiarmed Bandit Problem, SIAM Journal on Computing, vol.32, issue.1, pp.48-77, 2002.
DOI : 10.1137/S0097539701398375
A near-optimal algorithm for finite partial-monitoring games against adversarial opponents, Proc. COLT, 2013.
Partial Monitoring - Classification, Regret Bounds, and Algorithms, Mathematics of Operations Research, vol.39, issue.4, pp.967-997, 2014.
DOI : 10.1287/moor.2014.0663
Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, Foundations and Trends in Machine Learning, vol.5, issue.1, pp.1-122, 2012.
DOI : 10.1561/2200000024
A Survey of Preference-Based Online Learning with Bandit Algorithms, ALT 2014, pp.18-39, 2014.
DOI : 10.1007/978-3-319-11662-4_3
Top-k selection based on adaptive sampling of noisy preferences, ICML 2013 JMLR Proceedings, pp.1094-1102, 2013.
URL : https://hal.archives-ouvertes.fr/hal-01216055
Combinatorial bandits, COLT 2009, pp.237-246, 2009.
DOI : 10.1016/j.jcss.2012.01.001
Large-scale validation and analysis of interleaved search evaluation, ACM Transactions on Information Systems, vol.30, issue.1, p.6, 2012.
DOI : 10.1145/2094072.2094078
An updated survey on the linear ordering problem for weighted or unweighted tournaments, Annals of Operations Research, vol.29, issue.2, pp.107-158, 2010.
DOI : 10.1007/s10479-009-0648-7
An efficient boosting algorithm for combining preferences, J. Mach. Learn. Res., vol.4, pp.933-969, 2003.
Adaptive Game Playing Using Multiplicative Weights, Games and Economic Behavior, vol.29, issue.1-2, pp.79-103, 1999.
DOI : 10.1006/game.1999.0738
Preference Learning, 2010.
DOI : 10.1007/978-1-4899-7502-7_667-1
Evaluating the accuracy of implicit feedback from clicks and query reformulations in Web search, ACM Transactions on Information Systems, vol.25, issue.2, 2007.
DOI : 10.1145/1229179.1229181
Noisy binary search and its applications, SODA 2007, SIAM Proceedings, pp.881-890, 2007.
Learning to Rank for Information Retrieval, Foundations and Trends in Information Retrieval, vol.3, issue.3, pp.225-331, 2009.
DOI : 10.1561/1500000016
LETOR: Benchmark dataset for research on learning to rank for information retrieval, SIGIR, 2007.
Discrete prediction games with arbitrary feedback and loss, COLT/EuroCOLT, pp.208-223, 2001.
Active exploration for learning rankings from clickthrough data, Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '07, pp.570-579, 2007.
DOI : 10.1145/1281192.1281254
Evaluation and analysis of the performance of the EXP3 algorithm in stochastic environments, EWRL, volume 24 of JMLR Proceedings, pp.103-116, 2012.
Generic exploration and K-armed voting bandits, ICML 2013 JMLR Proceedings, pp.91-99, 2013.
Interactively optimizing information retrieval systems as a dueling bandits problem, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, pp.1201-1208, 2009.
DOI : 10.1145/1553374.1553527
Beat the mean bandit, ICML 2011, pp.241-248, 2011.
Relative upper confidence bound for the K-armed dueling bandit problem, ICML 2014 JMLR Proceedings, pp.10-18, 2014.
Relative confidence sampling for efficient online ranker evaluation, WSDM 2014, pp.73-82, 2014.