From Bandits to Experts: A Tale of Domination and Independence, NIPS-25, pp.1610-1618, 2012.
Nonstochastic multi-armed bandits with graph-structured feedback. arXiv preprint, 2014.
Minimax policies for adversarial and stochastic bandits, Proceedings of the 22nd Annual Conference on Learning Theory (COLT), 2009.
URL : https://hal.archives-ouvertes.fr/hal-00834882
Regret bounds and minimax policies under partial monitoring, Journal of Machine Learning Research, vol.11, pp.2785-2836, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00654356
The Nonstochastic Multiarmed Bandit Problem, SIAM Journal on Computing, vol.32, issue.1, pp.48-77, 2002.
DOI : 10.1137/S0097539701398375
High-probability regret bounds for bandit online linear optimization, COLT, pp.335-342, 2008.
Contextual bandit algorithms with supervised learning guarantees, AISTATS 2011, pp.19-26, 2011.
Towards minimax policies for online linear optimization with bandit feedback, 2012.
Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, Foundations and Trends® in Machine Learning, vol.5, issue.1, 2012.
DOI : 10.1561/2200000024
Prediction, Learning, and Games, 2006.
DOI : 10.1017/CBO9780511546921
Mirror descent meets fixed share (and feels no regret), NIPS-25, pp.989-997, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00670514
On tail probabilities for martingales. The Annals of Probability, pp.100-118, 1975.
A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting, Journal of Computer and System Sciences, vol.55, issue.1, pp.119-139, 1997.
DOI : 10.1006/jcss.1997.1504
Approximation to Bayes risk in repeated play. Contributions to the theory of games, pp.97-139, 1957.
Better Algorithms for Benign Bandits, The Journal of Machine Learning Research, vol.12, pp.1287-1311, 2011.
DOI : 10.1137/1.9781611973068.5
Volumetric spanners: an efficient exploration basis for learning, COLT, pp.408-422, 2014.
Tracking the best expert, Machine Learning, pp.151-178, 1998.
Efficient algorithms for online decision problems, Journal of Computer and System Sciences, vol.71, issue.3, pp.291-307, 2005.
DOI : 10.1016/j.jcss.2004.10.016
Efficient learning by implicit exploration in bandit problems with side observations, NIPS-27, pp.613-621, 2014.
The Weighted Majority Algorithm, Information and Computation, vol.108, issue.2, pp.212-261, 1994.
DOI : 10.1006/inco.1994.1009
From Bandits to Experts: On the Value of Side-Observations, Neural Information Processing Systems, 2011.
Tighter bounds for multi-armed bandits with expert advice, COLT, 2009.
First-order regret bounds for combinatorial semi-bandits, COLT, pp.1360-1375, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01215001
Online learning with predictable sequences, COLT, pp.993-1019, 2013.
PAC-Bayes-Bernstein inequality for martingales and its application to multiarmed bandits, Proceedings of the Workshop on On-line Trading of Exploration and Exploitation 2, 2012.
Aggregating strategies, Proceedings of the third annual workshop on Computational learning theory (COLT), pp.371-386, 1990.
DOI : 10.1016/B978-1-55860-146-8.50032-1