REGAL: A regularization based algorithm for reinforcement learning in weakly communicating MDPs, UAI, pp.35-42, 2009.
Bayes-optimal reinforcement learning for discrete uncertainty domains, Proceedings of the International Conference on Autonomous Agents and Multiagent Systems, 2012.
How to use expert advice, Journal of the ACM, vol.44, issue.3, pp.427-485, 1997.
DOI : 10.1145/258128.258179
The adaptive k-meteorologists problem and its application to structure learning and feature selection in reinforcement learning, ICML, 2009.
Efficient Reinforcement Learning in Parameterized Models: Discrete Parameter Case, European Workshop on Reinforcement Learning, 2008.
DOI : 10.1007/978-3-540-89722-4_4
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.380.1439
Probabilistic policy reuse in a reinforcement learning agent, Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems, AAMAS '06, pp.720-727, 2006.
DOI : 10.1145/1160633.1160762
Near-optimal regret bounds for reinforcement learning, Journal of Machine Learning Research, vol.11, pp.1563-1600, 2010.
Optimal regret bounds for selecting the state representation in reinforcement learning, ICML, pp.543-551, 2013.
Regret bounds for restless Markov bandits, ALT, pp.214-228, 2012.
DOI : 10.1007/978-3-642-34106-9_19
URL : https://hal.archives-ouvertes.fr/hal-00765450
An analytic solution to discrete Bayesian reinforcement learning, Proceedings of the 23rd international conference on Machine learning, ICML '06, 2006.
DOI : 10.1145/1143844.1143932
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.126.1774
Exploration-exploitation tradeoffs for experts algorithms in reactive environments, Advances in Neural Information Processing Systems 17, pp.409-416, 2004.
Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, 1998.
DOI : 10.1109/TNN.1998.712192
An experts algorithm for transfer learning, IJCAI, 2007.
Online Learning of Rested and Restless Bandits, IEEE Transactions on Information Theory, vol.58, issue.8, pp.5588-5611, 2012.
DOI : 10.1109/TIT.2012.2198613