Near-optimal BRL using optimistic local transitions (extended version), 2012.

A Bayesian sampling approach to exploration in reinforcement learning, Proc. of UAI, 2009.

R-max - a general polynomial time algorithm for near-optimal reinforcement learning, JMLR, vol.3, pp.213-231, 2003.

Optimal learning: Computational procedures for Bayes-adaptive Markov decision processes, 2002.

Near-optimal reinforcement learning in polynomial time, Machine Learning, pp.260-268, 1998.

Near-Bayesian exploration in polynomial time, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, 2009.

DOI : 10.1145/1553374.1553441

An analytic solution to discrete Bayesian reinforcement learning, Proceedings of the 23rd International Conference on Machine Learning, ICML '06, 2006.

DOI : 10.1145/1143844.1143932

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.126.1774

Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.

DOI : 10.1002/9780470316887

Variance-based rewards for approximate Bayesian reinforcement learning, Proc. of UAI, 2010.

Reinforcement learning in finite MDPs: PAC analysis, JMLR, vol.10, pp.2413-2444, 2009.

A Bayesian framework for reinforcement learning, Proc. of ICML, 2000.

Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, 1998.

DOI : 10.1109/TNN.1998.712192

Model-based reinforcement learning with nearly tight exploration complexity bounds, Proc. of ICML, 2010.

A theory of the learnable, Proc. of STOC, 1984.

Exploring compact reinforcement-learning representations with linear regression, Proc. of UAI, 2009.