Improved algorithms for linear stochastic bandits, Advances in Neural Information Processing Systems (NIPS), 2011.
Contextual bandit learning with predictable rewards, Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2012.
Taming the monster: A fast and simple algorithm for contextual bandits, 2014.
A multiworld testing decision service, 2016.
Open problem: First-order regret bounds for contextual bandits, Conference on Learning Theory (COLT), 2017.
Thompson sampling for contextual bandits with linear payoffs, Proceedings of the International Conference on Machine Learning (ICML), 2013.
Mostly exploration-free algorithms for contextual bandits, 2017.
Beating the hold-out: Bounds for k-fold and progressive cross-validation, Conference on Learning Theory (COLT), 1999.
An empirical evaluation of Thompson sampling, Advances in Neural Information Processing Systems (NIPS), 2011.
Adaptive subgradient methods for online learning and stochastic optimization, Journal of Machine Learning Research (JMLR), vol.12, pp.2121-2159, 2011.
Efficient optimal learning for contextual bandits, Conference on Uncertainty in Artificial Intelligence (UAI), 2011.
Doubly robust policy evaluation and learning, Proceedings of the International Conference on Machine Learning (ICML), 2011.
Thompson sampling with the online bootstrap, 2014.
Practical contextual bandits with regression oracles, Proceedings of the International Conference on Machine Learning (ICML), 2018.
Theory of disagreement-based active learning, Foundations and Trends in Machine Learning, vol.7, 2014.
Practical lessons from predicting clicks on ads at Facebook, Proceedings of the Eighth International Workshop on Data Mining for Online Advertising, 2014.
Algorithms for active learning, 2010.
Efficient and parsimonious agnostic active learning, Advances in Neural Information Processing Systems (NIPS), 2015.
NEXT: A system for real-world development, evaluation, and application of active learning, Advances in Neural Information Processing Systems (NIPS), 2015.
On the generalization ability of online strongly convex programming algorithms, Advances in Neural Information Processing Systems (NIPS), 2009.
A smoothed analysis of the greedy algorithm for the linear contextual bandit problem, Advances in Neural Information Processing Systems (NIPS), 2018.
Online importance weight aware updates, Conference on Uncertainty in Artificial Intelligence (UAI), 2011.
Active learning for cost-sensitive classification, 2017.
The epoch-greedy algorithm for multi-armed bandits with side information, Advances in Neural Information Processing Systems (NIPS), 2008.
A contextual-bandit approach to personalized news article recommendation, Proceedings of the 19th International Conference on World Wide Web, 2010.
Risk bounds for statistical learning, The Annals of Statistics, vol.34, issue.5, 2006.
Ad click prediction: a view from the trenches, Proceedings of the 19th ACM international conference on Knowledge discovery and data mining (KDD), 2013.
Bootstrapped Thompson sampling and deep exploration, 2015.
Online bagging and boosting, Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2001.
Efficient online bootstrapping for large scale learning, Workshop on Parallel and Large-scale Machine Learning (BigLearning@NIPS), 2013.
Normalized online learning, Conference on Uncertainty in Artificial Intelligence (UAI), 2013.
A tutorial on Thompson sampling, 2017.
On the likelihood that one unknown probability exceeds another in view of the evidence of two samples, Biometrika, vol.25, issue.3/4, 1933.