Y. Abbasi-Yadkori, D. Pál, and C. Szepesvári, Improved Algorithms for Linear Stochastic Bandits, Neural Information Processing Systems, pp.1-19, 2011.

J. Audibert, S. Bubeck, and G. Lugosi, Regret in Online Combinatorial Optimization, Mathematics of Operations Research, vol.39, issue.1, pp.31-45, 2013.
DOI : 10.1287/moor.2013.0598

P. Auer, N. Cesa-Bianchi, and P. Fischer, Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol.47, issue.2/3, pp.235-256, 2002.
DOI : 10.1023/A:1013689704352

A. Carpentier and R. Munos, Bandit Theory meets Compressed Sensing for high dimensional Stochastic Linear Bandit, Advances in Neural Information Processing Systems (NIPS), pp.251-259, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00659731

N. Cesa-Bianchi and G. Lugosi, Combinatorial bandits, Journal of Computer and System Sciences, vol.78, issue.5, pp.1404-1422, 2012.
DOI : 10.1016/j.jcss.2012.01.001

W. Chen, Y. Wang, and Y. Yuan, Combinatorial multi-armed bandit: General framework and applications, Proceedings of the 30th International Conference on Machine Learning (ICML), pp.151-159, 2013.

S. Filippi, O. Cappé, A. Garivier, and C. Szepesvári, Parametric Bandits: The Generalized Linear Case, Neural Information Processing Systems, pp.1-9, 2010.

Y. Gai, B. Krishnamachari, and R. Jain, Combinatorial Network Optimization With Unknown Variables: Multi-Armed Bandits With Linear Rewards and Individual Observations, IEEE/ACM Transactions on Networking, vol.20, issue.5, pp.1466-1478, 2012.
DOI : 10.1109/TNET.2011.2181864

A. Garivier, Informational confidence bounds for self-normalized averages and applications, 2013 IEEE Information Theory Workshop, 2013.
DOI : 10.1109/itw.2013.6691311
URL : https://hal.archives-ouvertes.fr/hal-00862062

J. Komiyama, J. Honda, and H. Nakagawa, Optimal Regret Analysis of Thompson Sampling in Stochastic Multi-armed Bandit Problem with Multiple Plays, Proceedings of the 32nd International Conference on Machine Learning, 2015.

B. Kveton, Z. Wen, A. Ashkan, and C. Szepesvári, Tight regret bounds for stochastic combinatorial semi-bandits, Proceedings of the 18th International Conference on Artificial Intelligence and Statistics, 2015.

T. L. Lai and H. Robbins, Asymptotically efficient adaptive allocation rules, Advances in Applied Mathematics, vol.6, issue.1, pp.4-22, 1985.

V. H. de la Peña, T. L. Lai, and Q.-M. Shao, Self-Normalized Processes: Limit Theory and Statistical Applications, 2008.

H. Robbins, Some aspects of the sequential design of experiments, Herbert Robbins Selected Papers, pp.169-177, 1985.

P. Rusmevichientong and J. N. Tsitsiklis, Linearly Parameterized Bandits, Mathematics of Operations Research, vol.35, issue.2, pp.1-40, 2010.
DOI : 10.1287/moor.1100.0446