M. Araya-López, V. Thomas, and O. Buffet, Near-optimal BRL using optimistic local transitions (extended version), 2012.

J. Asmuth, L. Li, M. L. Littman, A. Nouri, and D. Wingate, A Bayesian sampling approach to exploration in reinforcement learning, Proc. of UAI, 2009.

R. I. Brafman and M. Tennenholtz, R-max - a general polynomial time algorithm for near-optimal reinforcement learning, JMLR, vol.3, pp.213-231, 2003.

M. Duff, Optimal learning: Computational procedures for Bayes-adaptive Markov decision processes, 2002.

M. Kearns and S. Singh, Near-optimal reinforcement learning in polynomial time, Machine Learning, pp.260-268, 1998.

J. Kolter and A. Ng, Near-Bayesian exploration in polynomial time, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, 2009.
DOI : 10.1145/1553374.1553441

P. Poupart, N. Vlassis, J. Hoey, and K. Regan, An analytic solution to discrete Bayesian reinforcement learning, Proceedings of the 23rd international conference on Machine learning, ICML '06, 2006.
DOI : 10.1145/1143844.1143932
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.126.1774

M. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
DOI : 10.1002/9780470316887

J. Sorg, S. Singh, and R. L. Lewis, Variance-based rewards for approximate Bayesian reinforcement learning, Proc. of UAI, 2010.

A. L. Strehl, L. Li, and M. L. Littman, Reinforcement learning in finite MDPs: PAC analysis, JMLR, vol.10, pp.2413-2444, 2009.

M. J. Strens, A Bayesian framework for reinforcement learning, Proc. of ICML, 2000.

R. Sutton and A. Barto, Reinforcement Learning: An Introduction, MIT Press, 1998.

I. Szita and C. Szepesvári, Model-based reinforcement learning with nearly tight exploration complexity bounds, Proc. of ICML, 2010.

L. G. Valiant, A theory of the learnable, Proc. of STOC, 1984.

T. J. Walsh, I. Szita, C. Diuk, and M. L. Littman, Exploring compact reinforcement-learning representations with linear regression, Proc. of UAI, 2009.