J. D. Abernethy, E. Hazan, and A. Rakhlin, Interior-point methods for full-information and bandit online learning, IEEE Transactions on Information Theory, vol.58, issue.7, pp.4164-4175, 2012.

C. Allenberg, P. Auer, L. Györfi, and G. Ottucsák, Hannan consistency in on-line learning in case of unbounded losses under partial monitoring, International Conference on Algorithmic Learning Theory, pp.229-243, 2006.

O. Anava, E. Hazan, and S. Mannor, Online learning for adversaries with memory: price of past mistakes, Advances in Neural Information Processing Systems, pp.784-792, 2015.

R. Arora, O. Dekel, and A. Tewari, Online bandit learning against an adaptive adversary: from regret to policy regret, Proc. 29th ICML, 2012.

P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire, The nonstochastic multiarmed bandit problem, SIAM Journal on Computing, vol.32, issue.1, pp.48-77, 2002.

S. Bubeck, N. Cesa-Bianchi, and S. Kakade, Towards minimax policies for online linear optimization with bandit feedback, Annual Conference on Learning Theory, vol.23, pp.41-42, 2012.

N. Cesa-Bianchi and G. Lugosi, Combinatorial bandits, Journal of Computer and System Sciences, vol.78, issue.5, pp.1404-1422, 2012.

N. Cesa-Bianchi, C. Gentile, Y. Mansour, and A. Minora, Delay and cooperation in nonstochastic bandits, Conference on Learning Theory, pp.605-622, 2016.

V. Dani, S. M. Kakade, and T. P. Hayes, The price of bandit information for online optimization, Advances in Neural Information Processing Systems, pp.345-352, 2008.

O. Dekel, J. Ding, T. Koren, and Y. Peres, Online learning with composite loss functions, Conference on Learning Theory, pp.1214-1231, 2014.

O. Dekel, E. Hazan, and T. Koren, The blinded bandit: Learning with adaptive feedback, Advances in Neural Information Processing Systems, pp.1610-1618, 2014.

S. Garrabrant, N. Soares, and J. Taylor, Asymptotic convergence in online learning with unbounded delays, 2016.

E. Hazan, Introduction to online convex optimization, Foundations and Trends® in Optimization, vol.2, issue.3-4, pp.157-325, 2016.

P. Joulani, A. György, and C. Szepesvári, Online learning under delayed feedback, International Conference on Machine Learning, pp.1453-1461, 2013.

P. Joulani, A. György, and C. Szepesvári, Delay-tolerant online convex optimization: Unified analysis and adaptive-gradient algorithms, AAAI, vol.16, pp.1744-1750, 2016.

D. Khashabi, K. Quanrud, and A. Taghvaei, Adversarial delays in online strongly-convex optimization, 2015.