Y. Abbasi-Yadkori, D. Pál, and C. Szepesvári, Improved algorithms for linear stochastic bandits, Neural Information Processing Systems, p.9, 2011.

S. Agrawal and N. Goyal, Thompson sampling for contextual bandits with linear payoffs, International Conference on Machine Learning, p.18, 2013.

A. El Alaoui and M. W. Mahoney, Fast randomized kernel methods with statistical guarantees, Neural Information Processing Systems, 2015.

N. Alon, N. Cesa-Bianchi, O. Dekel, and T. Koren, Online learning with feedback graphs: Beyond bandits, Conference on Learning Theory, pp.24-34, 2015.

N. Alon, N. Cesa-Bianchi, C. Gentile, S. Mannor, Y. Mansour et al., Nonstochastic multi-armed bandits with graph-structured feedback, 2014.

N. Alon, N. Cesa-Bianchi, C. Gentile, and Y. Mansour, From bandits to experts: A tale of domination and independence, Neural Information Processing Systems, pp.26-28, 2013.

A. Ashkan, B. Kveton, S. Berkovsky, and Z. Wen, Diversified utility maximization for recommendations, Conference on Recommender Systems, 2014.

J. Audibert and S. Bubeck, Minimax policies for adversarial and stochastic bandits, Conference on Learning Theory, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00834882

J. Audibert, S. Bubeck, and G. Lugosi, Regret in Online Combinatorial Optimization, Mathematics of Operations Research, vol.39, issue.1, pp.31-45, 2014.
DOI : 10.1287/moor.2013.0598

J. Audibert, S. Bubeck, and R. Munos, Best arm identification in multi-armed bandits, Conference on Learning Theory, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00654404

P. Auer, Using confidence bounds for exploitation-exploration trade-offs, Journal of Machine Learning Research, vol.3, issue.19, pp.397-422, 2002.

P. Auer, N. Cesa-Bianchi, and P. Fischer, Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol.47, issue.2-3, pp.235-256, 2002.

P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire, The nonstochastic multi-armed bandit problem, SIAM Journal on Computing, vol.32, issue.1, pp.48-77, 2002.

P. Auer and R. Ortner, UCB revisited: Improved regret bounds for the stochastic multi-armed bandit problem, Periodica Mathematica Hungarica, vol.61, issue.1-2, 2010.
DOI : 10.1007/s10998-010-3055-6

M. Gheshlaghi Azar, A. Lazaric, and E. Brunskill, Online Stochastic Optimization under Correlated Bandit Feedback, International Conference on Machine Learning, pp.63-67, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01080138

K. Azuma, Weighted sums of certain dependent random variables, Tohoku Mathematical Journal, vol.19, issue.3, pp.357-367, 1967.
DOI : 10.2748/tmj/1178243286

V. Bala and S. Goyal, Learning from Neighbours, Review of Economic Studies, vol.65, issue.3, pp.595-621, 1998.
DOI : 10.1111/1467-937X.00059

Y. Bao, X. Wang, Z. Wang, C. Wu, and F. C. Lau, Online influence maximization in non-stationary Social Networks, 2016 IEEE/ACM 24th International Symposium on Quality of Service (IWQoS), 2016.
DOI : 10.1109/IWQoS.2016.7590438

A. Barabási and R. Albert, Emergence of scaling in random networks, Science, vol.286, pp.509-512, 1999.

G. Bartók, D. P. Foster, D. Pál, A. Rakhlin, and C. Szepesvári, Partial Monitoring - Classification, Regret Bounds, and Algorithms, Mathematics of Operations Research, vol.39, issue.4, pp.967-997, 2014.
DOI : 10.1287/moor.2014.0663

G. Bartók, D. Pál, and C. Szepesvári, Minimax regret of finite partial-monitoring games in stochastic environments, Conference on Learning Theory, 2011.

M. Belkin, I. Matveeva, and P. Niyogi, Regularization and semi-supervised learning on large graphs, Conference on Learning Theory, 2004.

M. Belkin, P. Niyogi, and V. Sindhwani, Manifold regularization: A geometric framework for learning from labeled and unlabeled examples, Journal of Machine Learning Research, vol.7, pp.2399-2434, 2006.

D. A. Berry, R. W. Chen, A. Zame, D. C. Heath, and L. A. Shepp, Bandit problems with infinitely many arms, The Annals of Statistics, vol.25, issue.5, pp.2103-2116, 1997.
DOI : 10.1214/aos/1069362389

D. Bertsimas and J. Tsitsiklis, Introduction to linear optimization, Athena Scientific, 1997.

D. Billsus, M. J. Pazzani, and J. Chen, A learning agent for wireless news access, Proceedings of the 5th international conference on Intelligent user interfaces, IUI '00, 2000.
DOI : 10.1145/325737.325768

Z. Bnaya, R. Puzis, R. Stern, and A. Felner, Social network search as a volatile multi-armed bandit problem, Human Journal, vol.2, issue.2, pp.84-98, 2013.

T. Bonald and A. Proutière, Two-target algorithms for infinite-armed bandits with Bernoulli rewards, Neural Information Processing Systems, pp.71-75, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00920045

S. Bubeck, R. Munos, and G. Stoltz, Pure exploration in finitely-armed and continuous-armed bandits, Theoretical Computer Science, vol.412, pp.1832-1852, 2011.
DOI : 10.1016/j.tcs.2010.12.059
URL : https://hal.archives-ouvertes.fr/hal-00609550

S. Buccapatnam, A. Eryilmaz, and N. B. Shroff, Stochastic bandits with side observations on networks, International Conference on Measurement and Modeling of Computer Systems, p.33, 2014.

A. D. Bull, Adaptive-treed bandits, Bernoulli, vol.21, pp.2289-2307, 2015.
DOI : 10.3150/14-BEJ644SUPP

A. N. Burnetas and M. N. Katehakis, Optimal Adaptive Policies for Sequential Allocation Problems, Advances in Applied Mathematics, vol.17, issue.2, pp.122-142, 1996.
DOI : 10.1006/aama.1996.0007

D. Calandriello, A. Lazaric, and M. Valko, Analysis of Nyström method with sequential ridge leverage scores, Uncertainty in Artificial Intelligence, 2016.

S. Caron, B. Kveton, M. Lelarge, and S. Bhagat, Leveraging side observations in stochastic bandits, Uncertainty in Artificial Intelligence, vol.39, p.33, 2012.
URL : https://hal.archives-ouvertes.fr/hal-01270324

A. Carpentier and M. Valko, Simple regret for infinitely many armed bandits, International Conference on Machine Learning, pp.74-76, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01153538

O. Catoni, Challenging the empirical mean and empirical variance: A deviation study, Annales de l'Institut Henri Poincaré, Probabilités et Statistiques, pp.1148-1185, 2012.
DOI : 10.1214/11-AIHP454
URL : https://hal.archives-ouvertes.fr/hal-00517206

N. Cesa-Bianchi, C. Gentile, Y. Mansour, and A. Minora, Delay and cooperation in nonstochastic bandits, Conference on Learning Theory, 2016.

N. Cesa-Bianchi, C. Gentile, and G. Zappella, A gang of bandits, Neural Information Processing Systems, pp.20-39, 2013.

N. Cesa-Bianchi and G. Lugosi, Prediction, learning, and games, 2006.
DOI : 10.1017/CBO9780511546921

D. H. Chau, A. Kittur, J. I. Hong, and C. Faloutsos, Apolo, Proceedings of the 2011 annual conference on Human factors in computing systems, CHI '11, 2011.
DOI : 10.1145/1978942.1978967

W. Chen, C. Wang, and Y. Wang, Scalable influence maximization for prevalent viral marketing in large-scale social networks, Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '10, 2010.
DOI : 10.1145/1835804.1835934

W. Chen, Y. Wang, and Y. Yuan, Combinatorial multi-armed bandit and its extension to probabilistically triggered arms, Journal of Machine Learning Research, vol.17, 2016.

W. Chu, L. Li, L. Reyzin, and R. E. Schapire, Contextual bandits with linear payoff functions, International Conference on Artificial Intelligence and Statistics, pp.19-51, 2011.

A. Cohen, T. Hazan, and T. Koren, Online learning with feedback graphs without the graphs, International Conference on Machine Learning, 2016.

R. Combes and A. Proutière, Unimodal bandits: Regret lower bounds and optimal algorithms, International Conference on Machine Learning, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01092662

E. Contal and N. Vayatis, Stochastic process bandits: Upper confidence bounds algorithms via generic chaining, 2016.

P. Coquelin and R. Munos, Bandit algorithms for tree search, Uncertainty in Artificial Intelligence, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00150207

R. Coulom, Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search, Computers and Games, vol.4630, pp.72-83, 2007.
DOI : 10.1007/978-3-540-75538-8_7
URL : https://hal.archives-ouvertes.fr/inria-00116992

T. Desautels, A. Krause, and J. Burdick, Parallelizing exploration-exploitation tradeoffs in Gaussian process bandit optimization, International Conference on Machine Learning, 2012.

J. Edmonds, Submodular Functions, Matroids, and Certain Polyhedra, Combinatorial Structures and Their Applications, pp.69-87, 1970.
DOI : 10.1007/3-540-36478-1_2

G. Ellison and D. Fudenberg, Rules of Thumb for Social Learning, Journal of Political Economy, vol.101, issue.4, pp.612-643, 1993.
DOI : 10.1086/261890

P. Erdős and A. Rényi, On random graphs, Publicationes Mathematicae, vol.6, pp.290-297, 1959.

M. Fang and D. Tao, Networked bandits with disjoint linear payoffs, Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '14, 2014.
DOI : 10.1145/2623330.2623672

M. Farajtabar, X. Ye, S. Harati, L. Song, and H. Zha, Multistage campaigning in social networks, Neural Information Processing Systems, 2016.

S. Fujishige, Submodular functions and optimization, Annals of Discrete Mathematics, 2005.

V. Gabillon, B. Kveton, Z. Wen, B. Eriksson, and S. Muthukrishnan, Adaptive submodular maximization in bandit setting, Neural Information Processing Systems, 2013.

Y. Gai, B. Krishnamachari, and R. Jain, Combinatorial Network Optimization With Unknown Variables: Multi-Armed Bandits With Linear Rewards and Individual Observations, IEEE/ACM Transactions on Networking, vol.20, issue.5, pp.1466-1478, 2012.
DOI : 10.1109/TNET.2011.2181864

D. Gale and S. Kariv, Bayesian learning in social networks, Games and Economic Behavior, vol.45, issue.2, pp.329-346, 2003.
DOI : 10.1016/S0899-8256(03)00144-1

S. Gelly, Y. Wang, R. Munos, and O. Teytaud, Modification of UCT with patterns in Monte-Carlo Go, 2006.
URL : https://hal.archives-ouvertes.fr/inria-00117266

C. Gentile, S. Li, and G. Zappella, Online clustering of bandits, International Conference on Machine Learning, p.21, 2014.

S. Ghosh and A. Prügel-Bennett, Ising Bandits with Side Information, European Conference on Machine Learning, 2015.
DOI : 10.1007/978-3-319-23528-8_28

M. Girvan and M. E. J. Newman, Community structure in social and biological networks, Proceedings of the National Academy of Sciences, vol.99, issue.12, pp.7821-7827, 2002.
DOI : 10.1073/pnas.122653799

J.-B. Grill, M. Valko, and R. Munos, Black-box optimization of noisy functions with unknown smoothness, Neural Information Processing Systems, pp.63-65, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01222915

Q. Gu and J. Han, Online Spectral Learning on a Graph with Bandit Feedback, 2014 IEEE International Conference on Data Mining, p.20, 2014.
DOI : 10.1109/ICDM.2014.72

A. Guillory and J. Bilmes, Online submodular set cover, ranking, and repeated active learning, Neural Information Processing Systems, 2011.

F. Guillou, R. Gaudel, and P. Preux, Collaborative filtering as a multiarmed bandit, NIPS Workshop on Machine Learning for eCommerce, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01256254

B. Haasdonk and E. Pękalska, Classification with kernel Mahalanobis distance classifiers, Advances in Data Analysis, Data Handling and Business Intelligence, pp.351-361, 2010.

J. Hannan, Approximation to Bayes risk in repeated play, Contributions to the Theory of Games, vol.3, pp.97-139, 1957.

M. Hauskrecht, R. Pelikan, M. Valko, and J. Lyons-Weiler, Feature Selection and Dimensionality Reduction in Genomics and Proteomics, Fundamentals of Data Mining in Genomics and Proteomics, 2006.
DOI : 10.1007/978-0-387-47509-7_7
URL : https://hal.archives-ouvertes.fr/hal-00643496

D. Jannach, M. Zanker, A. Felfernig, and G. Friedrich, Recommender systems: An introduction, 2010.
DOI : 10.1017/CBO9780511763113

A. Kalai and S. Vempala, Efficient algorithms for online decision problems, Journal of Computer and System Sciences, vol.71, issue.3, pp.291-307, 2005.

J. Kawale, H. H. Bui, B. Kveton, L. Tran-Thanh, and S. Chawla, Efficient Thompson sampling for online matrix-factorization recommendation, Neural Information Processing Systems, 2015.

D. Kempe, J. Kleinberg, and É. Tardos, Maximizing the spread of influence through a social network, Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '03, p.137, 2003.
DOI : 10.1145/956750.956769

R. Kleinberg, A. Slivkins, and E. Upfal, Multi-armed bandit problems in metric spaces, Symposium on Theory Of Computing, pp.63-66, 2008.

T. Kocák, G. Neu, and M. Valko, Online learning with Erdős-Rényi side-observation graphs, Uncertainty in Artificial Intelligence, vol.36, p.35, 2016.

T. Kocák, G. Neu, M. Valko, and R. Munos, Efficient learning by implicit exploration in bandit problems with side observations, Neural Information Processing Systems, pp.30-32, 2014.

T. Kocák, M. Valko, R. Munos, and S. Agrawal, Spectral Thompson sampling, AAAI Conference on Artificial Intelligence, p.8, 2014.

L. Kocsis and C. Szepesvári, Bandit Based Monte-Carlo Planning, European Conference on Machine Learning, p.63, 2006.
DOI : 10.1007/11871842_29
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.102.1296

R. K. Kolla, K. Jagannathan, and A. Gopalan, Collaborative learning of stochastic bandits over a social network, 2016 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2016.
DOI : 10.1109/ALLERTON.2016.7852375

W. M. Koolen, M. K. Warmuth, and J. Kivinen, Hedging structured concepts, Conference on Learning Theory, pp.28-29, 2010.

N. Korda, B. Szörényi, and S. Li, Distributed clustering of linear bandits in peer to peer networks, International Conference on Machine Learning, 2016.

I. Koutis, G. L. Miller, and D. Tolliver, Combinatorial preconditioners and multilevel solvers for problems in computer vision and image processing, Computer Vision and Image Understanding, vol.115, issue.12, pp.1638-1646, 2011.

B. Kveton, Z. Wen, A. Ashkan, H. Eydgahi, and B. Eriksson, Matroid bandits: Fast combinatorial optimization with learning, Uncertainty in Artificial Intelligence, pp.58-60, 2014.

B. Kveton, Z. Wen, A. Ashkan, and M. Valko, Learning to act greedily: Polymatroid semi-bandits, Journal of Machine Learning Research, pp.47-60, 2016.

S. Lei, S. Maniu, L. Mo, R. Cheng, and P. Senellart, Online influence maximization, Knowledge Discovery and Data Mining, 2015.

L. Li, W. Chu, J. Langford, and R. E. Schapire, A contextual-bandit approach to personalized news article recommendation, Proceedings of the 19th international conference on World wide web, WWW '10, pp.16-18, 2010.
DOI : 10.1145/1772690.1772758

L. Li, K. Jamieson, G. DeSalvo, A. Rostamizadeh, and A. Talwalkar, Efficient hyperparameter optimization and infinitely many armed bandits, 2016.

S. Li, A. Karatzoglou, and C. Gentile, Collaborative Filtering Bandits, Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, SIGIR '16, 2016.
DOI : 10.1145/2911451.2911548

Y. Ma, T. Huang, and J. Schneider, Active search and bandits on graphs using sigma-optimality, Uncertainty in Artificial Intelligence, 2015.

S. Mannor and O. Shamir, From bandits to experts: On the value of side-observations, Neural Information Processing Systems, pp.23-26, 2011.

J. Mary, R. Gaudel, and P. Preux, Bandits and Recommender Systems, First International Workshop on Machine Learning, Optimization, and Big Data, 2015.
DOI : 10.1007/978-3-319-27926-8_29
URL : https://hal.archives-ouvertes.fr/hal-01256033

M. McPherson, L. Smith-Lovin, and J. M. Cook, Birds of a Feather: Homophily in Social Networks, Annual Review of Sociology, vol.27, issue.1, pp.415-444, 2001.
DOI : 10.1146/annurev.soc.27.1.415

N. Megiddo, Optimal flows in networks with multiple sources and sinks, Mathematical Programming, vol.9, issue.1, pp.97-107, 1974.
DOI : 10.1007/BF01585506

R. Munos, Optimistic optimization of deterministic functions without the knowledge of its smoothness, Neural Information Processing Systems, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00830143

S. K. Narang, A. Gadde, and A. Ortega, Signal processing techniques for interpolation in graph structured data, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, 2013.
DOI : 10.1109/ICASSP.2013.6638704

H. Narasimhan, D. C. Parkes, and Y. Singer, Learnability of influence in networks, Neural Information Processing Systems, 2015.

G. Neu, Explore no more: Improved high-probability regret bounds for non-stochastic bandits, Neural Information Processing Systems, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01223501

G. Neu and G. Bartók, An Efficient Algorithm for Learning with Semi-bandit Feedback, Algorithmic Learning Theory, 2013.
DOI : 10.1007/978-3-642-40935-6_17

C. Papadimitriou and K. Steiglitz, Combinatorial Optimization, 1998.

P. Preux, R. Munos, and M. Valko, Bandits attack function optimization, 2014 IEEE Congress on Evolutionary Computation (CEC), p.69, 2014.
DOI : 10.1109/CEC.2014.6900558
URL : https://hal.archives-ouvertes.fr/hal-00978637

N. Prisadnikov, Exploration-exploitation trade-offs via probabilistic matrix factorization, 2014.

A. Rudi, R. Camoriano, and L. Rosasco, Less is more: Nyström computational regularization, Neural Information Processing Systems, 2015.

S. Samothrakis, D. Perez, and S. Lucas, Training gradient boosting machines using curve-fitting and information-theoretic features for causal direction detection, NIPS Workshop on Causality, 2013.

B. Schölkopf and A. J. Smola, Learning with kernels: Support vector machines, regularization, optimization, and beyond, 2001.

Y. Seldin, P. Bartlett, K. Crammer, and Y. Abbasi-Yadkori, Prediction with limited advice and multiarmed bandits with paid observations, International Conference on Machine Learning, 2014.

J. Shawe-Taylor and N. Cristianini, Kernel methods for pattern analysis, pp.49-53, 2004.
DOI : 10.1017/CBO9780511809682

D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre et al., Mastering the game of Go with deep neural networks and tree search, Nature, vol.529, issue.7587, pp.484-489, 2016.
DOI : 10.1038/nature16961

A. Singla, E. Horvitz, P. Kohli, R. White, and A. Krause, Information gathering in networks via active exploration, International Joint Conferences on Artificial Intelligence, 2015.

A. Slivkins, Multi-armed bandits on implicit metric spaces, Neural Information Processing Systems, p.64, 2011.

N. Srinivas, A. Krause, S. M. Kakade, and M. Seeger, Gaussian process optimization in the bandit setting: No regret and experimental design, International Conference on Machine Learning, pp.47-52, 2010.

B. Szörényi, G. Kedenburg, and R. Munos, Optimistic planning in Markov decision processes using a generative model, Neural Information Processing Systems, 2014.

W. R. Thompson, On the likelihood that one unknown probability exceeds another in view of the evidence of two samples, Biometrika, vol.25, issue.3-4, pp.285-294, 1933.
DOI : 10.1093/biomet/25.3-4.285

S. Tu and J. Zhu, A bandit method using probabilistic matrix factorization in recommendation, Journal of Shanghai Jiaotong University (Science), vol.14, issue.5, pp.535-539, 2015.
DOI : 10.1007/s12204-015-1618-7

M. Valko, A. Carpentier, and R. Munos, Stochastic simultaneous optimistic optimization, International Conference on Machine Learning, p.67, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00789606

M. Valko, N. Korda, R. Munos, I. Flaounas, and N. Cristianini, Finite-Time Analysis of Kernelised Contextual Bandits, Uncertainty in Artificial Intelligence, pp.47-52, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00826946

M. Valko, R. Munos, B. Kveton, and T. Kocák, Spectral bandits for smooth graph functions, International Conference on Machine Learning, pp.18-39, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00986818

S. Vaswani and L. V. S. Lakshmanan, Adaptive influence maximization in social networks: Why commit when you can adapt?, Technical report, 2016.

S. Vaswani, L. V. S. Lakshmanan, and M. Schmidt, Influence maximization with bandits, NIPS Workshop on Networks in the Social and Information Sciences, 2015.

Y. Wang, J. Audibert, and R. Munos, Algorithms for infinitely many-armed bandits, Neural Information Processing Systems, pp.71-75, 2008.

D. J. Watts and S. H. Strogatz, Collective dynamics of 'small-world' networks, Nature, vol.393, issue.6684, pp.440-442, 1998.
DOI : 10.1038/30918

Z. Wen, B. Kveton, B. Eriksson, and S. Bhamidipati, Sequential Bayesian search, International Conference on Machine Learning, 2013.

Z. Wen, B. Kveton, and M. Valko, Influence maximization with semi-bandit feedback, 2016.

H. Whitney, On the abstract properties of linear dependence, American Journal of Mathematics, vol.57, issue.3, pp.509-533, 1935.

Y. Wu, A. György, and C. Szepesvári, Online learning with Gaussian payoffs and side observations, Neural Information Processing Systems, pp.31-34, 2015.

J. Y. Yu and S. Mannor, Unimodal bandits, International Conference on Machine Learning, p.21, 2011.

Y. Yue and C. Guestrin, Linear submodular bandits and their application to diversified retrieval, Neural Information Processing Systems, 2011.

F. Zhang, The Schur complement and its applications, 2005.
DOI : 10.1007/b105056

X. Zhu, Semi-supervised learning literature survey, 2008.