E. Levin, R. Pieraccini, and W. Eckert, A stochastic model of human-machine interaction for learning dialog strategies, IEEE Transactions on Speech and Audio Processing, vol.8, issue.1, pp.11-23, 2000.

V. Rieser and O. Lemon, Reinforcement learning for adaptive dialogue systems, 2011.
DOI : 10.1007/978-3-642-24942-6

O. Pietquin and T. Dutoit, A probabilistic framework for dialog simulation and optimal strategy learning, IEEE Transactions on Audio, Speech, and Language Processing, vol.14, issue.2, pp.589-599, 2006.
URL : https://hal.archives-ouvertes.fr/hal-00207952

H. Cuayáhuitl, Evaluation of a hierarchical reinforcement learning spoken dialogue system, Computer Speech & Language, vol.24, issue.2, 2009.
DOI : 10.1016/j.csl.2009.07.001

J. D. Williams and S. Young, Partially observable Markov decision processes for spoken dialog systems, Computer Speech & Language, vol.21, issue.2, pp.393-422, 2007.
DOI : 10.1016/j.csl.2006.06.008

T. Paek and R. Pieraccini, Automating spoken dialogue management design using machine learning: An industry perspective, Speech Communication, vol.50, issue.8-9, pp.716-729, 2008.
DOI : 10.1016/j.specom.2008.03.010

A. Y. Ng and S. J. Russell, Algorithms for inverse reinforcement learning, Proceedings of the 17th International Conference on Machine Learning (ICML), pp.663-670, 2000.

D. Ramachandran and E. Amir, Bayesian inverse reinforcement learning, Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI), 2007.

B. Michini and J. P. How, Improving the efficiency of Bayesian inverse reinforcement learning, 2012 IEEE International Conference on Robotics and Automation, pp.3651-3656, 2012.
DOI : 10.1109/ICRA.2012.6225241

M. A. Walker, An application of reinforcement learning to dialogue strategy selection in a spoken dialogue system, Journal of Artificial Intelligence Research, vol.12, pp.387-416, 2000.

S. Young, M. Gašić, S. Keizer, F. Mairesse, J. Schatzmann et al., The Hidden Information State model: A practical framework for POMDP-based spoken dialogue management, Computer Speech & Language, vol.24, issue.2, pp.150-174, 2010.
DOI : 10.1016/j.csl.2009.04.001
URL : https://hal.archives-ouvertes.fr/hal-00598186

J. R. Tetreault and D. J. Litman, A Reinforcement Learning approach to evaluating state representations in spoken dialogue systems, Speech Communication, vol.50, issue.8-9, pp.683-696, 2008.
DOI : 10.1016/j.specom.2008.05.002

L. M. Rojas-Barahona, A. Lorenzo, and C. Gardent, An end-to-end evaluation of two situated dialog systems, Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL), pp.10-19, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00726723

P. Abbeel and A. Y. Ng, Apprenticeship learning via inverse reinforcement learning, Proceedings of the Twenty-first International Conference on Machine Learning (ICML '04), 2004.
DOI : 10.1145/1015330.1015430
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.2.92

P. Abbeel, A. Coates, and A. Y. Ng, Autonomous Helicopter Aerobatics through Apprenticeship Learning, The International Journal of Robotics Research, vol.29, issue.13, pp.1608-1639, 2010.
DOI : 10.1177/0278364910371999

S. Chandramohan, M. Geist, F. Lefèvre, and O. Pietquin, User simulation in dialogue systems using inverse reinforcement learning, Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech), pp.1025-1028, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00652446

A. Boularias, H. R. Chinaei, and B. Chaib-draa, Learning the reward model of dialogue POMDPs from data, NIPS Workshop on Machine Learning for Assistive Techniques, 2010.

S. Zhifei and E. M. Joo, A review of inverse reinforcement learning theory and recent advances, 2012 IEEE Congress on Evolutionary Computation, pp.1-8, 2012.
DOI : 10.1109/CEC.2012.6256507

L. M. Rojas-Barahona, A. Lorenzo, and C. Gardent, Building and exploiting a corpus of dialog interactions between French-speaking virtual and human agents, Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC), 2012.
URL : https://hal.archives-ouvertes.fr/hal-00726721

R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. MIT Press, 1998.