J.-Y. Audibert and S. Bubeck, Minimax policies for adversarial and stochastic bandits, Conference on Learning Theory, 2009.
URL: https://hal.archives-ouvertes.fr/hal-00834882

P. Auer, N. Cesa-Bianchi, and P. Fischer, Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol. 47, pp. 235-256, 2002.

P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire, The nonstochastic multi-armed bandit problem, SIAM Journal on Computing, vol. 32, issue 1, pp. 48-77, 2002.