J. Abernethy, E. Hazan, and A. Rakhlin, Interior-point methods for full-information and bandit online learning, IEEE Transactions on Information Theory, vol.58, issue.7, pp.4164-4175, 2012.
DOI : 10.1109/tit.2012.2192096
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.362.4515

J. Abernethy, C. Lee, A. Sinha, and A. Tewari, Online linear optimization via smoothing, pp.807-823, 2014.

C. Allenberg, P. Auer, L. Györfi, and G. Ottucsák, Hannan Consistency in On-Line Learning in Case of Unbounded Losses Under Partial Monitoring, Proceedings of the 17th International Conference on Algorithmic Learning Theory, pp.229-243, 2006.
DOI : 10.1007/11894841_20

J. Audibert and S. Bubeck, Minimax policies for adversarial and stochastic bandits, Proceedings of the 22nd Annual Conference on Learning Theory (COLT), 2009.
URL : https://hal.archives-ouvertes.fr/hal-00834882

J. Audibert and S. Bubeck, Regret bounds and minimax policies under partial monitoring, Journal of Machine Learning Research, vol.11, pp.2635-2686, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00654356

J. Audibert, S. Bubeck, and G. Lugosi, Regret in Online Combinatorial Optimization, Mathematics of Operations Research, vol.39, issue.1, pp.31-45, 2014.
DOI : 10.1287/moor.2013.0598

P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire, The Nonstochastic Multiarmed Bandit Problem, SIAM Journal on Computing, vol.32, issue.1, pp.48-77, 2002.
DOI : 10.1137/S0097539701398375

P. Auer, N. Cesa-Bianchi, and C. Gentile, Adaptive and Self-Confident On-Line Learning Algorithms, Journal of Computer and System Sciences, vol.64, issue.1, pp.48-75, 2002.
DOI : 10.1006/jcss.2001.1795

N. Cesa-Bianchi and G. Lugosi, Prediction, Learning, and Games, 2006.
DOI : 10.1017/CBO9780511546921

N. Cesa-Bianchi and G. Lugosi, Combinatorial bandits, Proceedings of the 22nd Annual Conference on Learning Theory (COLT), pp.237-246, 2009.

N. Cesa-Bianchi and G. Lugosi, Combinatorial bandits, Journal of Computer and System Sciences, vol.78, issue.5, pp.1404-1422, 2012.
DOI : 10.1016/j.jcss.2012.01.001

N. Cesa-Bianchi, Y. Mansour, and G. Stoltz, Improved second-order bounds for prediction with expert advice, Proceedings of the 18th Annual Conference on Learning Theory (COLT), pp.217-232, 2005.
URL : https://hal.archives-ouvertes.fr/hal-00019799

W. Chen, Y. Wang, and Y. Yuan, Combinatorial multi-armed bandit: General framework and applications, Proceedings of the 30th International Conference on Machine Learning, pp.151-159, 2013.

L. Devroye, G. Lugosi, and G. Neu, Prediction by random-walk perturbation, pp.460-473, 2013.

Y. Gai, B. Krishnamachari, and R. Jain, Combinatorial Network Optimization With Unknown Variables: Multi-Armed Bandits With Linear Rewards and Individual Observations, IEEE/ACM Transactions on Networking, vol.20, issue.5, pp.1466-1478, 2012.
DOI : 10.1109/TNET.2011.2181864

A. György, T. Linder, G. Lugosi, and G. Ottucsák, The on-line shortest path problem under partial monitoring, Journal of Machine Learning Research, vol.8, pp.2369-2403, 2007.

J. Hannan, Approximation to Bayes risk in repeated play, Contributions to the Theory of Games, pp.97-139, 1957.

E. Hazan and S. Kale, Extracting certainty from uncertainty: regret bounded by variation in costs, Machine Learning, pp.165-188, 2010.
DOI : 10.1007/s10994-010-5175-x
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.139.615

E. Hazan and S. Kale, Better Algorithms for Benign Bandits, Journal of Machine Learning Research, vol.12, pp.1287-1311, 2011.
DOI : 10.1137/1.9781611973068.5
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.139.5763

M. Hutter and J. Poland, Prediction with Expert Advice by Following the Perturbed Leader for General Weights, Proceedings of the 15th International Conference on Algorithmic Learning Theory (ALT), pp.279-293, 2004.
DOI : 10.1007/978-3-540-30215-5_22

A. Kalai and S. Vempala, Efficient algorithms for online decision problems, Journal of Computer and System Sciences, vol.71, issue.3, pp.291-307, 2005.
DOI : 10.1016/j.jcss.2004.10.016

T. Kocák, G. Neu, M. Valko, and R. Munos, Efficient learning by implicit exploration in bandit problems with side observations, Advances in Neural Information Processing Systems 27, pp.613-621, 2014.

W. M. Koolen, M. K. Warmuth, and J. Kivinen, Hedging structured concepts, Proceedings of the 23rd Annual Conference on Learning Theory (COLT), pp.93-105, 2010.

B. Kveton, Z. Wen, A. Ashkan, and C. Szepesvári, Tight regret bounds for stochastic combinatorial semi-bandits, AISTATS, 2015.

G. Neu and G. Bartók, An Efficient Algorithm for Learning with Semi-bandit Feedback, Proceedings of the 24th International Conference on Algorithmic Learning Theory, pp.234-248, 2013.
DOI : 10.1007/978-3-642-40935-6_17

J. Poland, FPL Analysis for Adaptive Bandits, 3rd Symposium on Stochastic Algorithms, Foundations and Applications (SAGA'05), pp.58-69, 2005.
DOI : 10.1007/11571155_7

A. Rakhlin and K. Sridharan, Online learning with predictable sequences, pp.993-1019, 2013.

S. Rakhlin, O. Shamir, and K. Sridharan, Relax and randomize: From value to algorithms, Advances in Neural Information Processing Systems 25, pp.2150-2158, 2012.

A. Sani, G. Neu, and A. Lazaric, Exploiting easy data in online optimization, pp.810-818, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01079428

G. Stoltz, Incomplete information and internal regret in prediction of individual sequences, 2005.
URL : https://hal.archives-ouvertes.fr/tel-00009759

T. van Erven, M. Warmuth, and W. Kotłowski, Follow the leader with dropout perturbations, pp.949-974, 2014.