P. Auer, N. Cesa-Bianchi, and P. Fischer, Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol. 47, no. 2/3, pp. 235-256, 2002.
DOI : 10.1023/A:1013689704352

L. Devroye, L. Györfi, and G. Lugosi, A Probabilistic Theory of Pattern Recognition, 1996.

G. Stoltz, Incomplete information and internal regret in prediction of individual sequences, 2005.
URL : https://hal.archives-ouvertes.fr/tel-00009759