Variance estimates and exploration function in multi-armed bandit, 2007.
Finite time analysis of the multiarmed bandit problem, Machine Learning, vol.47, issue.2/3, pp.235-256, 2002.
DOI : 10.1023/A:1013689704352
Exploration versus exploitation challenge, 2nd PASCAL Challenges Workshop, 2006.
Multi-armed Bandit Allocation Indices. Wiley-Interscience series in systems and optimization, 1989.
DOI : 10.1002/9780470980033
Asymptotically efficient adaptive allocation rules, Advances in Applied Mathematics, vol.6, issue.1, pp.4-22, 1985.
DOI : 10.1016/0196-8858(85)90002-8
Machine learning and nonparametric bandit theory, IEEE Transactions on Automatic Control, vol.40, pp.1199-1209, 1995.
Some aspects of the sequential design of experiments, Bulletin of the American Mathematical Society, vol.58, issue.5, pp.527-535, 1952.
DOI : 10.1090/S0002-9904-1952-09620-8
On the likelihood that one unknown probability exceeds another in view of the evidence of two samples, Biometrika, vol.25, issue.3-4, pp.285-294, 1933.
DOI : 10.1093/biomet/25.3-4.285