Exploration in model-based reinforcement learning by empirically estimating learning progress, Neural Information Processing Systems (NIPS), 2012.
URL : https://hal.archives-ouvertes.fr/hal-00755248
Reinforcement Learning: An Introduction, 1998.
R-max: A general polynomial time algorithm for near-optimal reinforcement learning, Journal of Machine Learning Research, pp. 213-231, 2002.
Near-Bayesian exploration in polynomial time, Proceedings of the International Conference on Machine Learning (ICML), pp. 513-520, 2009.
Autonomous exploration for navigating in MDPs, Conference on Learning Theory (COLT), 2012.
Curious model-building control systems, Proceedings of the 1991 IEEE International Joint Conference on Neural Networks, pp. 1458-1463, 1991.
DOI : 10.1109/IJCNN.1991.170605
Intrinsic Motivation Systems for Autonomous Mental Development, IEEE Transactions on Evolutionary Computation, 2007.
DOI : 10.1109/TEVC.2006.890271
R-IAC: Robust Intrinsically Motivated Exploration and Active Learning, IEEE Transactions on Autonomous Mental Development, 2009.
DOI : 10.1109/TAMD.2009.2037513
Finite-time analysis of the multiarmed bandit problem, Machine Learning, 2002.
Bandit Based Monte-Carlo Planning, European Conference on Machine Learning (ECML), 2006.
DOI : 10.1007/11871842_29
An analysis of model-based interval estimation for Markov decision processes, Journal of Computer and System Sciences, 2008.
A finite-time analysis of multiarmed bandits problems with Kullback-Leibler divergence, Conference on Learning Theory (COLT), 2011.
Continuous Upper Confidence Trees with Polynomial Exploration - Consistency, ECML/PKDD, 2013.
DOI : 10.1007/978-3-642-40988-2_13
URL : https://hal.archives-ouvertes.fr/hal-00835352