J.-Y. Audibert and S. Bubeck, Minimax policies for adversarial and stochastic bandits, Conference on Learning Theory, 2009.
URL: https://hal.archives-ouvertes.fr/hal-00834882

P. Auer, N. Cesa-Bianchi, and P. Fischer, Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol.47, pp.235-256, 2002.

P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire, The nonstochastic multi-armed bandit problem, SIAM Journal on Computing, vol.32, issue.1, pp.48-77, 2002.

O. Besbes, Y. Gur, and A. Zeevi, Stochastic multi-armed bandit problem with non-stationary rewards, Neural Information Processing Systems, 2014.

A. Bifet and R. Gavaldà, Learning from time-changing data with adaptive windowing, SIAM International Conference on Data Mining, 2007.

D. Bouneffouf and R. Féraud, Multi-armed bandit problem with known trend, Neurocomputing, vol.205, issue.C, pp.16-21, 2016.