Barycenters in the Wasserstein space, SIAM Journal on Mathematical Analysis, vol.43, issue.2, pp.904-924, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00637399
Methods of information geometry, Translations of mathematical monographs, vol.191, 2007.
Finite-time analysis of the multiarmed bandit problem, Machine learning, vol.47, issue.2-3, pp.235-256, 2002.
Information geometry of covariance matrix: Cartan-Siegel homogeneous bounded domains, Mostow/Berger fibration and Fréchet median, Matrix Information Geometry, pp.199-255, 2013.
A problem in the sequential design of experiments, Sankhyā: The Indian Journal of Statistics, vol.16, issue.3/4, pp.221-229, 1956.
Fundamentals of Statistical Exponential Families: With Applications in Statistical Decision Theory, 1986.
Regret analysis of stochastic and nonstochastic multi-armed bandit problems, Foundations and Trends in Machine Learning, vol.5, issue.1, pp.1-122, 2012.
Pure exploration in multi-armed bandits problems, pp.23-37, 2009.
Sanov property, generalized I-projection and a conditional limit theorem, The Annals of Probability, vol.12, issue.3, pp.768-793, 1984.
Optimal statistical decisions, Wiley Classics Library, vol.82, 2005.
Probability: theory and examples, 2010.
Adaptive web crawling through structure-based link classification, Proc. ICADL, pp.39-51, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01261960
The KL-UCB algorithm for bounded stochastic bandits and beyond, COLT, pp.359-376, 2011.
On explore-then-commit strategies, Advances in Neural Information Processing Systems, vol.29, pp.784-792, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01322906
Best arm identification in multi-armed bandits, COLT, pp.41-53, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00654404
Algorithm AS 103: Psi (digamma) function, Journal of the Royal Statistical Society. Series C (Applied Statistics), vol.25, issue.3, pp.315-317, 1976.
Multiple identifications in multi-armed bandits, ICML, pp.258-265, 2013.
Elements of information theory, 2012.
Sample complexity of episodic fixed-horizon reinforcement learning, NIPS, pp.2818-2826, 2015.
Sur les lois de probabilités à estimation exhaustive, C. R. Acad. Sci, pp.1265-1266, 1935.
Bandit processes and dynamic allocation indices, Journal of the Royal Statistical Society. Series B (Methodological), vol.41, issue.2, pp.148-177, 1979.
An asymptotically optimal policy for finite support models in the multiarmed bandit problem, Machine Learning, vol.85, issue.3, pp.361-391, 2011.
On Bayesian index policies for sequential resource allocation, Annals of Statistics, vol.46, issue.2, pp.842-865, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01251606
On Bayesian upper confidence bounds for bandit problems, AISTATS, pp.592-600, 2012.
Information complexity in bandit subset selection, COLT, pp.228-251, 2013.
On distributions admitting a sufficient statistic, Transactions of the American Mathematical Society, vol.39, issue.3, pp.399-409, 1936.
Information theory and statistics, Courier Corporation, 1997.
Asymptotic solutions of bandit problems, Stochastic differential systems, stochastic control theory and applications, pp.275-292, 1988.
Asymptotically efficient adaptive allocation rules, Adv. Appl. Math, vol.6, issue.1, pp.4-22, 1985.
Computing a classic index for finite-horizon bandits, INFORMS Journal on Computing, vol.23, issue.2, pp.254-267, 2011.
(More) efficient reinforcement learning via posterior sampling, NIPS, pp.3003-3011, 2013.
Pure exploration in episodic fixed-horizon Markov decision processes, AAMAS, pp.1703-1704, 2017.
Some aspects of the sequential design of experiments, Bull. Amer. Math. Soc, vol.58, issue.5, pp.527-535, 1952.
A modern Bayesian look at the multi-armed bandit, Applied Stochastic Models in Business and Industry, vol.26, issue.6, pp.639-658, 2010.
On the likelihood that one unknown probability exceeds another in view of the evidence of two samples, Biometrika, vol.25, p.285, 1933.