E. Altman, Flow control using the theory of zero-sum Markov games, IEEE Trans. on Auto. Control, issue.39, 1994.
DOI : 10.1109/cdc.1992.371155

R. Bellman, The theory of dynamic programming, Bulletin of the American Mathematical Society, vol.60, issue.6, 1954.

D. Bertsekas and J. Tsitsiklis, , 1996.

B. Bonet and H. Geffner, Faster heuristic search algorithms for planning with uncertainty and full feedback, Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence (IJCAI'03), 2003.

B. Bonet and H. Geffner, Labeled RTDP: Improving the convergence of real-time dynamic programming, Proceedings of the Thirteenth International Conference on Automated Planning and Scheduling (ICAPS'03), 2003.

B. Bošanský, V. Lisý, M. Lanctot, J. Čermák, and M. H. M. Winands, Algorithms for computing strategies in two-player simultaneous move games, Artificial Intelligence, vol.237, 2016.

V. Bulitko and G. Lee, Learning in Real-Time Search: A Unifying Framework, Journal of Artificial Intelligence Research, vol.25, p.25, 2006.
DOI : 10.1613/jair.1789

J. Dibangoye, C. Amato, O. Buffet, and F. Charpillet, Optimally solving Dec-POMDPs as continuous-state MDPs, Journal of Artificial Intelligence Research, vol.55, 2016.
URL : https://hal.archives-ouvertes.fr/hal-00907338

E. Hansen and S. Zilberstein, LAO*: A heuristic search algorithm that finds solutions with loops, Artificial Intelligence, vol.129, 2001.

M. G. Lagoudakis and R. Parr, Value function approximation in zero-sum Markov games, Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence (UAI'02), 2002.

M. Littman, Markov games as a framework for multi-agent reinforcement learning, Proceedings of the Eleventh International Conference on Machine Learning (ICML'94), 1994.
DOI : 10.1016/B978-1-55860-335-6.50027-1

M. Littman and C. Szepesvári, A generalized reinforcement learning model: Convergence and applications, Proceedings of the International Conference on Machine Learning (ICML'96), 1996.

H. B. McMahan, M. Likhachev, and G. J. Gordon, Bounded real-time dynamic programming, Proceedings of the 22nd International Conference on Machine Learning (ICML'05), 2005.
DOI : 10.1145/1102351.1102423

C. Meyer, J. Ganascia, and J. Zucker, Learning strategies in games by anticipation, Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence (IJCAI'97), 1997.
URL : https://hal.archives-ouvertes.fr/hal-01649000

J. Nash, Equilibrium points in n-person games, Proceedings of the National Academy of Sciences, 1950.

J. Pérolat, B. Piot, M. Geist, B. Scherrer, and O. Pietquin, Softened approximate policy iteration for Markov games, Proceedings of the International Conference on Machine Learning, 2016.

T. E. S. Raghavan and J. A. Filar, Algorithms for stochastic games - a survey, 1991.

D. M. Roijers, P. Vamplew, S. Whiteson, and R. Dazeley, A survey of multi-objective sequential decision-making, Journal of Artificial Intelligence Research, p.48, 2013.

S. M. Ross, Goofspiel - the game of pure strategy, Journal of Applied Probability, issue.8, 1971.

S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, 2010.

A. Saffidine, H. Finnsson, and M. Buro, Alpha-Beta pruning for games with simultaneous moves, Proceedings of the 26th AAAI Conference (AAAI), 2012.

L. S. Shapley, Stochastic games, Proceedings of the National Academy of Sciences, p.39, 1953.

L. S. Shapley, Some Topics in Two-Person Games, Annals of Mathematical Studies, vol.5, 1964.
DOI : 10.1515/9781400882014-002

T. Smith, Probabilistic Planning for Robotic Exploration, PhD thesis, The Robotics Institute, 2007.

T. Smith and R. Simmons, Point-based POMDP algorithms: Improved analysis and implementation, Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence (UAI), 2005.

E. Solal, Stochastic games, Encyclopedia of Database Systems, 2009.

D. Szer, F. Charpillet, and S. Zilberstein, MAA*: A heuristic search algorithm for solving decentralized POMDPs, Proceedings of the 21st Conference on Uncertainty in Artificial Intelligence (UAI'05), 2005.
URL : https://hal.archives-ouvertes.fr/inria-00000204

J. Neumann, , 1928.

E. Walraven and M. T. Spaan, Accelerated vector pruning for optimal POMDP solvers, Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17), 2017.