J. Audibert, S. Bubeck, and R. Munos, Best arm identification in multi-armed bandits, Proceedings of the Twenty-Third Annual Conference on Learning Theory, pp.41-53, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00654404

P. Auer, N. Cesa-Bianchi, and P. Fischer, Finite-time analysis of the multi-armed bandit problem, Machine Learning, vol.47, issue.2/3, pp.235-256, 2002.
DOI : 10.1023/A:1013689704352

S. Bubeck, R. Munos, and G. Stoltz, Pure Exploration in Multi-armed Bandits Problems, Proceedings of the Twentieth International Conference on Algorithmic Learning Theory, pp.23-37, 2009.

S. Bubeck, T. Wang, and N. Viswanathan, Multiple identifications in multi-armed bandits, CoRR, abs/1205.3181, 2012.

K. Deng, J. Pineau, and S. Murphy, Active learning for developing personalized treatment, Proceedings of the Twenty-Seventh International Conference on Uncertainty in Artificial Intelligence, pp.161-168, 2011.

E. Even-Dar, S. Mannor, and Y. Mansour, Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems, Journal of Machine Learning Research, vol.7, pp.1079-1105, 2006.

V. Gabillon, M. Ghavamzadeh, A. Lazaric, and S. Bubeck, Multi-bandit best arm identification, Proceedings of Advances in Neural Information Processing Systems 24, pp.2222-2230, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00632523

S. Kalyanakrishnan, Learning Methods for Sequential Decision Making with Imperfect Representations, 2011.

S. Kalyanakrishnan and P. Stone, Efficient selection of multiple bandit arms: Theory and practice, Proceedings of the Twenty-Seventh International Conference on Machine Learning, pp.511-518, 2010.

S. Kalyanakrishnan, A. Tewari, P. Auer, and P. Stone, PAC subset selection in stochastic multi-armed bandits, Proceedings of the Twenty-Ninth International Conference on Machine Learning, 2012.

O. Maron and A. Moore, Hoeffding races: Accelerating model selection search for classification and function approximation, Proceedings of Advances in Neural Information Processing Systems 6, pp.59-66, 1993.

A. Maurer and M. Pontil, Empirical Bernstein bounds and sample-variance penalization, Proceedings of the Twenty-Second Annual Conference on Learning Theory, 2009.

V. Mnih, C. Szepesvári, and J. Audibert, Empirical Bernstein stopping, Proceedings of the Twenty-Fifth International Conference on Machine Learning, pp.672-679, 2008.
DOI : 10.1145/1390156.1390241
URL : https://hal.archives-ouvertes.fr/hal-00834983

H. Robbins, Some aspects of the sequential design of experiments, Bulletin of the American Mathematical Society, vol.58, issue.5, pp.527-535, 1952.
DOI : 10.1090/S0002-9904-1952-09620-8