. Amato, Optimizing memory-bounded controllers for decentralized POMDPs, Proc. of the Twenty-Third Conf. on Uncertainty in Artificial Intelligence (UAI-07), 2007.
DOI : 10.1007/s10458-009-9103-z

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.123.8934

. Amato, Solving POMDPs using quadratically constrained linear programs, Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems, AAMAS '06, 2007.
DOI : 10.1145/1160633.1160694

URL : http://anytime.cs.umass.edu/aimath06/proceedings/P56.pdf

. Amato, Bounded dynamic programming for decentralized POMDPs, Proc. of the Workshop on Multi-Agent Sequential Decision Making in Uncertain Domains (MSDM) in AAMAS'07, 2007.

B. Anderson and J. Moore, Time-varying feedback laws for decentralized control, Nineteenth IEEE Conference on Decision and Control including the Symposium on Adaptive Processes, pp.519-524, 1980.
DOI : 10.1109/TAC.1981.1102770

. Becker, Solving transition independent decentralized Markov decision processes, Journal of Artificial Intelligence Research, vol.22, pp.423-455, 2004.

R. Bellman, Dynamic programming, 1957.

. Bernstein, The Complexity of Decentralized Control of Markov Decision Processes, Mathematics of Operations Research, vol.27, issue.4, pp.819-840, 2002.
DOI : 10.1287/moor.27.4.819.297

. Bernstein, Bounded policy iteration for decentralized POMDPs, Proc. of the Nineteenth Int. Joint Conf. on Artificial Intelligence (IJCAI), pp.1287-1292, 2005.

A. Boularias and B. Chaib-draa, Exact dynamic programming for decentralized POMDPs with lossless policy compression, Proc. of the Int. Conf. on Automated Planning and Scheduling (ICAPS'08), 2008.

C. Boutilier, Planning, learning and coordination in multiagent decision processes, Proceedings of the 6th Conference on Theoretical Aspects of Rationality and Knowledge (TARK '96), De Zeeuwse Stromen, 1996.

. Buffet, Shaping multi-agent systems with gradient reinforcement learning, Autonomous Agents and Multi-Agent Systems, vol.33, issue.1, pp.197-220, 2007.
DOI : 10.1007/s10458-006-9010-5

URL : https://hal.archives-ouvertes.fr/inria-00118983

. Chadès, A heuristic approach for solving decentralized-POMDP, Proceedings of the 2002 ACM symposium on Applied computing, SAC '02, pp.57-62, 2002.
DOI : 10.1145/508791.508804

G. B. Dantzig, On the Significance of Solving Linear Programming Problems with Some Integer Variables, Econometrica, vol.28, issue.1, pp.30-44, 1960.
DOI : 10.2307/1905292

U. Diwekar, Introduction to Applied Optimization.

R. Drenick, Multilinear programming: Duality theories, Journal of Optimization Theory and Applications, vol.72, issue.3, pp.459-486, 1992.

R. Fletcher, Practical Methods of Optimization, 1987.
DOI : 10.1002/9781118723203

M. Ghavamzadeh and S. Mahadevan, Learning to communicate and act in cooperative multiagent systems using hierarchical reinforcement learning, Proc. of the 3rd Int. Joint Conf. on Autonomous Agents and Multi-Agent Systems (AAMAS'04), 2004.

S. Govindan and R. Wilson, A global Newton method to compute Nash equilibria, Journal of Economic Theory, vol.110, issue.1, pp.65-86, 2001.
DOI : 10.1016/S0022-0531(03)00005-X

. Hansen, Dynamic programming for partially observable stochastic games, Proc. of the Nineteenth National Conference on Artificial Intelligence (AAAI-04), 2004.

R. Horst and P. Pardalos, Handbook of global optimization, 1995.
DOI : 10.1007/978-1-4615-2025-2

M. R. James and S. Singh, Learning and discovery of predictive state representations in dynamical systems with reset, Proc. of the Twenty-first Int. Conf. of Machine Learning, 2004.

. Kaelbling, Planning and acting in partially observable stochastic domains, Artificial Intelligence, vol.101, issue.1-2, pp.99-134, 1998.
DOI : 10.1016/S0004-3702(98)00023-X

D. Koller and N. Megiddo, Finding mixed strategies with small supports in extensive form games, International Journal of Game Theory, vol.18, issue.1, pp.73-92, 1996.
DOI : 10.1007/BF01254386

. Koller, Fast algorithms for finding randomized strategies in game trees, Proceedings of the twenty-sixth annual ACM symposium on Theory of computing, STOC '94, pp.750-759, 1994.
DOI : 10.1145/195058.195451

C. Lemke, Bimatrix Equilibrium Points and Mathematical Programming, Management Science, vol.11, issue.7, pp.681-689, 1965.
DOI : 10.1287/mnsc.11.7.681

D. Luenberger, Linear and Nonlinear Programming, 1984.
DOI : 10.1007/978-3-319-18842-3

P. McCracken and M. H. Bowling, Online discovery and learning of predictive state representations, Advances in Neural Information Processing Systems 18 (NIPS'05), 2005.

. Nair, Taming decentralized POMDPs: towards efficient policy computation for multiagent setting, Proc. of Int. Joint Conference on Artificial Intelligence, IJCAI'03, 2003.

. Oliehoek, Optimal and approximate Q-value functions for decentralized POMDPs, Journal of Artificial Intelligence Research (JAIR), vol.32, pp.289-353, 2008.
DOI : 10.1145/1329125.1329390

URL : http://orbilu.uni.lu/handle/10993/11032

M. J. Osborne and A. Rubinstein, A Course in Game Theory, 1994.

C. H. Papadimitriou and K. Steiglitz, Combinatorial Optimization: Algorithms and Complexity, 1982.

C. H. Papadimitriou and J. Tsitsiklis, The Complexity of Markov Decision Processes, Mathematics of Operations Research, vol.12, issue.3, pp.441-450, 1987.
DOI : 10.1287/moor.12.3.441

S. Parsons and M. Wooldridge, Game theory and decision theory in multi-agent systems, Autonomous Agents and Multi-Agent Systems, vol.5, issue.3, pp.243-254, 2002.
DOI : 10.1023/A:1015575522401

M. Petrik and S. Zilberstein, Anytime coordination using separable bilinear programs, Proc. of the National Conference on Artificial Intelligence (AAAI), 2007.

M. Petrik and S. Zilberstein, Average-reward decentralized Markov decision processes, Proc. of the Twentieth Int. Joint Conf. on Artificial Intelligence, 2007.

M. Puterman, Markov Decision Processes: discrete stochastic dynamic programming, 1994.
DOI : 10.1002/9780470316887

D. Pynadath and M. Tambe, The Communicative Multiagent Team Decision Problem: Analyzing Teamwork Theories And Models, Journal of Artificial Intelligence Research, vol.16, pp.389-423, 2002.

R. Radner, The Application of Linear Programming to Team Decision Problems, Management Science, vol.5, issue.2, pp.143-150, 1959.
DOI : 10.1287/mnsc.5.2.143

S. Russell and P. Norvig, Artificial Intelligence: A modern approach, 1995.

T. Sandholm, Multiagent systems, chapter Distributed rational decision making, pp.201-258, 1999.

. Sandholm, Mixed-integer programming methods for finding nash equilibria, Proc. of the National Conference on Artificial Intelligence (AAAI), 2005.

F. Charpillet, Cooperative co-learning: A model based approach for solving multi agent reinforcement problems, Proc. of the IEEE Int. Conf. on Tools with Artificial Intelligence (ICTAI'02), 2002.
URL : https://hal.archives-ouvertes.fr/inria-00100814

S. Seuken and S. Zilberstein, Memory-bounded dynamic programming for DEC-POMDPs, Proc. of the Twentieth Int. Joint Conf. on Artificial Intelligence (IJCAI'07), 2007.

. Singh, Learning Without State-Estimation in Partially Observable Markovian Decision Processes, Proceedings of the Eleventh International Conference on Machine Learning, 1994.
DOI : 10.1016/B978-1-55860-335-6.50042-8

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.48.9833

. Singh, Learning predictive state representations, Proc. of the Twentieth Int. Conf. of Machine Learning (ICML'03), 2003.

D. Szer and F. Charpillet, Point-based Dynamic Programming for DEC-POMDPs, Proc. of the Twenty-First National Conf. on Artificial Intelligence, 2006.
URL : https://hal.archives-ouvertes.fr/inria-00104443

. Szer, MAA*: A heuristic search algorithm for solving decentralized POMDPs, Proc. of the Twenty-First Conf. on Uncertainty in Artificial Intelligence (UAI'05), pp.576-583, 2005.
URL : https://hal.archives-ouvertes.fr/inria-00000204

. Thomas, Interac-DEC-MDP : Towards the use of interactions in DEC-MDP, Proc. of the Third Int. Joint Conf. on Autonomous Agents and Multi-Agent Systems (AAMAS'04), pp.1450-1451, 2004.
URL : https://hal.archives-ouvertes.fr/inria-00108104

R. J. Vanderbei, Linear Programming: Foundations and Extensions, 2008.
DOI : 10.1057/palgrave.jors.2600987

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.111.1824

B. von Stengel, Computing equilibria for two-person games, Handbook of Game Theory, pp.1723-1759, 2002.

J. Wu and E. H. Durfee, Mixed-integer linear programming for transition-independent decentralized MDPs, Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems, AAMAS '06, pp.1058-1060, 2006.
DOI : 10.1145/1160633.1160822

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.105.3094