R. Aras, A. Dutech, and F. Charpillet, Cooperation in stochastic games through communication, Proc. of the fourth Int. Conf. on Autonomous Agents and Multi-Agent Systems (AAMAS'05), 2005.
URL : https://hal.archives-ouvertes.fr/inria-00000208

C. Boutilier, Planning, learning and coordination in multiagent decision processes, Proceedings of the 6th Conference on Theoretical Aspects of Rationality and Knowledge (TARK '96), De Zeeuwse Stromen, The Netherlands, 1996.

M. Bowling and M. Veloso, Multiagent learning using a variable learning rate, Artificial Intelligence, vol.136, issue.2, pp.215-250, 2002.
DOI : 10.1016/S0004-3702(02)00121-2
URL : http://doi.org/10.1016/s0004-3702(02)00121-2

A. Cassandra, M. Littman, and N. Zhang, Incremental pruning: A simple, fast, exact method for partially observable Markov decision processes, Proc. of the Conf. on Uncertainty in Artificial Intelligence (UAI), 1997.

C. Claus and C. Boutilier, The dynamics of reinforcement learning in cooperative multiagent systems, pp.746-752, 1998.

A. Greenwald and K. Hall, Correlated Q-learning, Proc. of the 20th Int. Conf. on Machine Learning (ICML), 2003.

E. Hansen, D. Bernstein, and S. Zilberstein, Dynamic programming for partially observable stochastic games, Proc. of the Nineteenth National Conference on Artificial Intelligence (AAAI-04), 2004.

J. Hu and M. Wellman, Multiagent reinforcement learning: theoretical framework and an algorithm, Proceedings of the Fifteenth International Conference on Machine Learning, pp.242-250, 1998.

J. Hu and M. Wellman, Nash Q-learning for general-sum stochastic games, Journal of Machine Learning Research, 2003.

M. Littman and C. Szepesvári, A generalized reinforcement-learning model: Convergence and applications, Proc. of the Thirteenth Int. Conf. on Machine Learning (ICML'96), 1996.