C. Baier and J.-P. Katoen, Principles of model checking, 2008.

A. G. Barto, S. J. Bradtke, and S. P. Singh, Learning to act using real-time dynamic programming, Artificial Intelligence, vol.72, issue.1-2, pp.81-138, 1995.
DOI : 10.1016/0004-3702(94)00011-O

T. Brázdil, K. Chatterjee, M. Chmelík, V. Forejt, J. Křetínský et al., Verification of Markov Decision Processes Using Learning Algorithms, pp.98-114, 2014.
DOI : 10.1007/978-3-319-11936-6_8

P. Dai, Mausam, D. S. Weld, and J. Goldsmith, Topological value iteration algorithms, J. Artif. Intell. Res. (JAIR), vol.42, pp.181-209, 2011.

M. Kwiatkowska, D. Parker, and H. Qu, Incremental quantitative verification for Markov decision processes, 2011 IEEE/IFIP 41st International Conference on Dependable Systems & Networks (DSN), pp.359-370, 2011.
DOI : 10.1109/DSN.2011.5958249
URL : https://hal.archives-ouvertes.fr/hal-00647057

M. L. Puterman, Markov decision processes: discrete stochastic dynamic programming, 1994.
DOI : 10.1002/9780470316887

S. Sanner, R. Goetschalckx, K. Driessens, and G. Shani, Bayesian real-time dynamic programming, In: IJCAI, pp.1784-1789, 2009.

T. Smith and R. Simmons, Focused real-time dynamic programming for MDPs: Squeezing more out of a heuristic, In: AAAI, pp.1227-1232, 2006.

D. Wingate and K. D. Seppi, Prioritization methods for accelerating MDP solvers, Journal of Machine Learning Research, pp.851-881, 2005.