J. Mitola and G. Q. Maguire, Cognitive radio: making software radios more personal, IEEE Personal Communications, vol.6, issue.4, pp.13-18, 1999.
DOI : 10.1109/98.788210

S. Haykin, Cognitive radio: brain-empowered wireless communications, IEEE Journal on Selected Areas in Communications, vol.23, issue.2, pp.201-220, 2005.
DOI : 10.1109/JSAC.2004.839380

URL : http://www.eecs.berkeley.edu/~dtse/3r_haykin_jsac05.pdf

H. Robbins, Some aspects of the sequential design of experiments, Bulletin of the American Mathematical Society, vol.58, issue.5, pp.527-535, 1952.
DOI : 10.1090/S0002-9904-1952-09620-8

T. L. Lai and H. Robbins, Asymptotically efficient adaptive allocation rules, Advances in Applied Mathematics, vol.6, issue.1, pp.4-22, 1985.
DOI : 10.1016/0196-8858(85)90002-8

Q. Zhao and B. M. Sadler, A Survey of Dynamic Spectrum Access, IEEE Signal Processing Magazine, vol.24, issue.3, pp.79-89, 2007.
DOI : 10.1109/MSP.2007.361604

W. Jouini, D. Ernst, C. Moy, and J. Palicot, Upper Confidence Bound Based Decision Making Strategies and Dynamic Spectrum Access, 2010 IEEE International Conference on Communications, pp.1-5, 2010.
DOI : 10.1109/ICC.2010.5502014

URL : https://hal.archives-ouvertes.fr/hal-00489331

P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire, The Nonstochastic Multiarmed Bandit Problem, SIAM Journal on Computing, vol.32, issue.1, pp.48-77, 2002.
DOI : 10.1137/S0097539701398375

URL : http://homepages.math.uic.edu/%7Elreyzin/f14_mcs548/auer02.pdf

A. Garivier and O. Cappé, The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond, COLT, pp.359-376, 2011.

E. Kaufmann, O. Cappé, and A. Garivier, On Bayesian Upper Confidence Bounds for Bandit Problems, AISTATS, pp.592-600, 2012.

S. Bubeck and N. Cesa-Bianchi, Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, Foundations and Trends in Machine Learning, vol.5, issue.1, pp.1-122, 2012.
DOI : 10.1561/2200000024

URL : http://arxiv.org/pdf/1204.5721.pdf

W. Jouini, C. Moy, and J. Palicot, Decision making for cognitive radio equipment: analysis of the first 10 years of exploration, EURASIP Journal on Wireless Communications and Networking, vol.2012, pp.1-16, 2012.

URL : https://hal.archives-ouvertes.fr/hal-00682511

O. Maillard and R. Munos, Adaptive Bandits: Towards the best history-dependent strategy, AISTATS, pp.570-578, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00574999

W. R. Thompson, On the likelihood that one unknown probability exceeds another in view of the evidence of two samples, Biometrika, vol.25, issue.3-4, pp.285-294, 1933.

P. Auer, N. Cesa-Bianchi, and P. Fischer, Finite-Time Analysis of the Multi-Armed Bandit Problem, Machine Learning, vol.47, issue.2-3, pp.235-256, 2002.
DOI : 10.1023/A:1013689704352

S. Agrawal and N. Goyal, Analysis of Thompson sampling for the Multi-Armed Bandit problem, Conference On Learning Theory (COLT), JMLR Workshop and Conference Proceedings, vol.23, pp.39.1-39.26, 2012.

E. Kaufmann, N. Korda, and R. Munos, Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis, Algorithmic Learning Theory (ALT), pp.199-213, 2012.
DOI : 10.1007/978-3-642-34106-9_18

URL : https://hal.archives-ouvertes.fr/hal-00830033

A. Agarwal, H. Luo, B. Neyshabur, and R. E. Schapire, Corralling a Band of Bandit Algorithms, arXiv preprint, 2016.

A. Singla, H. Hassani, and A. Krause, Learning to Use Learners' Advice, arXiv preprint, 2017.

H. Luo, A. Agarwal, and J. Langford, Efficient Contextual Bandits in Non-stationary Worlds, arXiv preprint, 2017.