Nearoptimal BRL using optimistic local transitions (extended version), 2012. ,
A Bayesian sampling approach to exploration in reinforcement learning, Proc. of UAI, 2009. ,
R-max -a general polynomial time algorithm for near-optimal reinforcement learning, JMLR, vol.3, pp.213-231, 2003. ,
Optimal learning: Computational procedures for Bayes-adaptive Markov decision processes, 2002. ,
Near-optimal reinforcement learning in polynomial time, Machine Learning, pp.260-268, 1998. ,
Near-Bayesian exploration in polynomial time, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, 2009. ,
DOI : 10.1145/1553374.1553441
An analytic solution to discrete Bayesian reinforcement learning, Proceedings of the 23rd international conference on Machine learning , ICML '06, 2006. ,
DOI : 10.1145/1143844.1143932
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.126.1774
Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994. ,
DOI : 10.1002/9780470316887
Variance-based rewards for approximate Bayesian reinforcement learning, Proc. of UAI, 2010. ,
Reinforcement learning in finite MDPs: PAC analysis, JMLR, vol.10, pp.2413-2444, 2009. ,
A Bayesian framework for reinforcement learning, Proc. of ICML, 2000. ,
Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, 1998. ,
DOI : 10.1109/TNN.1998.712192
Model-based reinforcement learning with nearly tight exploration complexity bounds, Proc. of ICML, 2010. ,
A theory of the learnable, Proc. of STOC, 1984. ,
Exploring compact reinforcement-learning representations with linear regression, Proc. of UAI, 2009. ,