Distributed delayed stochastic optimization, Advances in Neural Information Processing Systems, pp.873-881, 2011.
Randomized allocation with nonparametric estimation for contextual multi-armed bandits with delayed rewards, 2019.
Minimax policies for bandits games, 2009.
Finite-time analysis of the multiarmed bandit problem, Machine Learning, vol.47, issue.2-3, pp.235-256, 2002.
Adaptive and minimax optimal estimation of the tail coefficient, Statistica Sinica, pp.1133-1144, 2015.
Nonstochastic bandits with composite anonymous feedback, Conference on Learning Theory, pp.750-773, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01916981
Modeling delayed feedback in display advertising, Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp.1097-1105, 2014.
An empirical evaluation of Thompson sampling, Advances in Neural Information Processing Systems, pp.2249-2257, 2011.
Universal portfolios, The Kelly Capital Growth Investment Criterion: Theory and Practice, pp.181-209, 2011.
Extreme value theory: an introduction, 2007.
Efficient optimal learning for contextual bandits, 2011.
Learning with prolonged delay of reinforcement, Psychonomic Science, vol.5, issue.3, pp.121-122, 1966.
Stochastic bandits with delayed composite anonymous feedback, 2019.
Delay-tolerant online convex optimization: Unified analysis and adaptive-gradient algorithms, Thirtieth AAAI Conference on Artificial Intelligence, pp.1453-1461, 2013.
Asymptotically efficient adaptive allocation rules, Advances in Applied Mathematics, vol.6, issue.1, pp.4-22, 1985.
Slow learners are fast, Proceedings of the 22nd International Conference on Neural Information Processing Systems, pp.2331-2339, 2009.
Bandit algorithms, 2019.
The queue method: Handling delay, heuristics, prior data, and evaluation in bandits, Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015.
Learning from delayed outcomes with intermediate observations, 2018.
Delay-tolerant algorithms for asynchronous distributed online learning, Advances in Neural Information Processing Systems, pp.4102-4110, 2014.
Online learning with adversarial delays, Advances in Neural Information Processing Systems, pp.1270-1278, 2015.
Delay adaptive distributed stochastic convex optimization, 2015.
Nonstochastic multiarmed bandits with unrestricted delays, 2019.
Stochastic bandit models for delayed conversions, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01545667
Contextual bandits under delayed feedback, 2018.
On delayed prediction of individual sequences, IEEE Transactions on Information Theory, vol.48, issue.7, pp.1959-1976, 2002.
A nonparametric delayed feedback model for conversion rate prediction, 2018.
Learning in generalized linear contextual bandits with stochastic delays, Advances in Neural Information Processing Systems, vol.32, pp.5198-5209, 2019.