Interior-point methods for full-information and bandit online learning, IEEE Transactions on Information Theory, vol.58, issue.7, pp.4164-4175, 2012.
DOI : 10.1109/tit.2012.2192096
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.362.4515
Online linear optimization via smoothing, pp.807-823, 2014.
Hannan Consistency in On-Line Learning in Case of Unbounded Losses Under Partial Monitoring, Proceedings of the 17th International Conference on Algorithmic Learning Theory, pp.229-243, 2006.
DOI : 10.1007/11894841_20
Minimax policies for adversarial and stochastic bandits, Proceedings of the 22nd Annual Conference on Learning Theory (COLT), 2009.
URL : https://hal.archives-ouvertes.fr/hal-00834882
Regret bounds and minimax policies under partial monitoring, Journal of Machine Learning Research, vol.11, pp.2635-2686, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00654356
Regret in Online Combinatorial Optimization, Mathematics of Operations Research, vol.39, issue.1, pp.31-45, 2014.
DOI : 10.1287/moor.2013.0598
The Nonstochastic Multiarmed Bandit Problem, SIAM Journal on Computing, vol.32, issue.1, pp.48-77, 2002.
DOI : 10.1137/S0097539701398375
Adaptive and Self-Confident On-Line Learning Algorithms, Journal of Computer and System Sciences, vol.64, issue.1, pp.48-75, 2002.
DOI : 10.1006/jcss.2001.1795
Prediction, Learning, and Games, 2006.
DOI : 10.1017/CBO9780511546921
Combinatorial bandits, Proceedings of the 22nd Annual Conference on Learning Theory, pp.237-246, 2009.
DOI : 10.1016/j.jcss.2012.01.001
Combinatorial bandits, Journal of Computer and System Sciences, vol.78, issue.5, pp.1404-1422, 2012.
DOI : 10.1016/j.jcss.2012.01.001
Improved second-order bounds for prediction with expert advice, Proceedings of the 18th Annual Conference on Learning Theory (COLT-2005), pp.217-232, 2005.
URL : https://hal.archives-ouvertes.fr/hal-00019799
Combinatorial multi-armed bandit: General framework and applications, Proceedings of the 30th International Conference on Machine Learning, pp.151-159, 2013.
Prediction by random-walk perturbation, pp.460-473, 2013.
Combinatorial Network Optimization With Unknown Variables: Multi-Armed Bandits With Linear Rewards and Individual Observations, IEEE/ACM Transactions on Networking, vol.20, issue.5, pp.1466-1478, 2012.
DOI : 10.1109/TNET.2011.2181864
The on-line shortest path problem under partial monitoring, Journal of Machine Learning Research, vol.8, pp.2369-2403, 2007.
Approximation to Bayes risk in repeated play, Contributions to the Theory of Games, pp.97-139, 1957.
Extracting certainty from uncertainty: regret bounded by variation in costs, Machine Learning, pp.165-188, 2010.
DOI : 10.1007/s10994-010-5175-x
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.139.615
Better Algorithms for Benign Bandits, Journal of Machine Learning Research, vol.12, pp.1287-1311, 2011.
DOI : 10.1137/1.9781611973068.5
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.139.5763
Prediction with Expert Advice by Following the Perturbed Leader for General Weights, Proceedings of the 15th International Conference on Algorithmic Learning Theory (ALT), pp.279-293, 2004.
DOI : 10.1007/978-3-540-30215-5_22
Efficient algorithms for online decision problems, Journal of Computer and System Sciences, vol.71, issue.3, pp.291-307, 2005.
DOI : 10.1016/j.jcss.2004.10.016
Efficient learning by implicit exploration in bandit problems with side observations, Ghahramani et al., pp.613-621, 2014.
Hedging structured concepts, Proceedings of the 23rd Annual Conference on Learning Theory (COLT), pp.93-105, 2010.
Tight regret bounds for stochastic combinatorial semi-bandits, AISTATS, 2015.
An Efficient Algorithm for Learning with Semi-bandit Feedback, Proceedings of the 24th International Conference on Algorithmic Learning Theory, pp.234-248, 2013.
DOI : 10.1007/978-3-642-40935-6_17
FPL Analysis for Adaptive Bandits, 3rd Symposium on Stochastic Algorithms, Foundations and Applications (SAGA'05), pp.58-69, 2005.
DOI : 10.1007/11571155_7
Online learning with predictable sequences, pp.993-1019, 2013.
Relax and randomize: From value to algorithms, Advances in Neural Information Processing Systems 25, pp.2150-2158.
Exploiting easy data in online optimization, pp.810-818, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01079428
Incomplete information and internal regret in prediction of individual sequences, 2005.
URL : https://hal.archives-ouvertes.fr/tel-00009759
Follow the leader with dropout perturbations, pp.949-974, 2014.