C. Allenberg, P. Auer, L. Györfi, and G. Ottucsák, Hannan Consistency in On-Line Learning in Case of Unbounded Losses Under Partial Monitoring, Algorithmic Learning Theory, pp.229-243, 2006.
DOI : 10.1007/11894841_20

N. Alon, N. Cesa-Bianchi, O. Dekel, and T. Koren, Online learning with feedback graphs: Beyond bandits, Conference on Learning Theory, 2015.

N. Alon, N. Cesa-Bianchi, C. Gentile, and Y. Mansour, From bandits to experts: A tale of domination and independence, Neural Information Processing Systems, 2013.

J. Audibert and S. Bubeck, Regret bounds and minimax policies under partial monitoring, Journal of Machine Learning Research, vol.11, pp.2785-2836, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00654356

P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire, The Nonstochastic Multiarmed Bandit Problem, SIAM Journal on Computing, vol.32, issue.1, pp.48-77, 2002.
DOI : 10.1137/S0097539701398375

P. Auer, N. Cesa-Bianchi, and C. Gentile, Adaptive and Self-Confident On-Line Learning Algorithms, Journal of Computer and System Sciences, vol.64, issue.1, pp.48-75, 2002.
DOI : 10.1006/jcss.2001.1795

S. Buccapatnam, A. Eryilmaz, and N. B. Shroff, Stochastic bandits with side observations on networks, International Conference on Measurement and Modeling of Computer Systems, 2014.

S. Caron, B. Kveton, M. Lelarge, and S. Bhagat, Leveraging side observations in stochastic bandits, Uncertainty in Artificial Intelligence, 2012.
URL : https://hal.archives-ouvertes.fr/hal-01270324

A. Carpentier and M. Valko, Revealing graph bandits for maximizing local influence, International Conference on Artificial Intelligence and Statistics, pp.10-18, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01304020

N. Cesa-Bianchi and G. Lugosi, Prediction, learning, and games, 2006.
DOI : 10.1017/CBO9780511546921

N. Cesa-Bianchi, G. Lugosi, and G. Stoltz, Minimizing Regret With Label Efficient Prediction, IEEE Transactions on Information Theory, vol.51, issue.6, pp.2152-2162, 2005.
DOI : 10.1109/TIT.2005.847729
URL : https://hal.archives-ouvertes.fr/hal-00007537

A. Cohen, T. Hazan, and T. Koren, Online learning with feedback graphs without the graphs, International Conference on Machine Learning, 2016.

Y. Freund and R. E. Schapire, A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting, Journal of Computer and System Sciences, vol.55, issue.1, pp.119-139, 1997.
DOI : 10.1006/jcss.1997.1504

L. Györfi and G. Ottucsák, Sequential Prediction of Unbounded Stationary Time Series, IEEE Transactions on Information Theory, vol.53, issue.5, pp.1866-1872, 2007.
DOI : 10.1109/TIT.2007.894660

T. Kocák, G. Neu, and M. Valko, Online learning with noisy side observations, International Conference on Artificial Intelligence and Statistics, pp.1186-1194, 2016.

T. Kocák, G. Neu, M. Valko, and R. Munos, Efficient learning by implicit exploration in bandit problems with side observations, Neural Information Processing Systems, pp.613-621, 2014.

N. Littlestone and M. Warmuth, The Weighted Majority Algorithm, Information and Computation, vol.108, issue.2, pp.212-261, 1994.
DOI : 10.1006/inco.1994.1009

S. Mannor and O. Shamir, From bandits to experts: On the value of side-observations, Neural Information Processing Systems, 2011.

G. Neu and G. Bartók, An Efficient Algorithm for Learning with Semi-bandit Feedback, Algorithmic Learning Theory, 2013.
DOI : 10.1007/978-3-642-40935-6_17

Y. Seldin, P. Bartlett, K. Crammer, and Y. Abbasi-Yadkori, Prediction with limited advice and multiarmed bandits with paid observations, International Conference on Machine Learning, 2014.