C. Amato, D. S. Bernstein, and S. Zilberstein, Solving POMDPs using quadratically constrained linear programs, Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems, AAMAS '06, pp.2418-2424, 2007.
DOI : 10.1145/1160633.1160694
URL : http://anytime.cs.umass.edu/aimath06/proceedings/P56.pdf

C. Amato, J. S. Dibangoye, and S. Zilberstein, Incremental policy generation for finite-horizon DEC-POMDPs, Proceedings of the Nineteenth International Conference on Automated Planning and Scheduling, 2009.

R. Aras and A. Dutech, An investigation into mathematical programming for finite horizon decentralized POMDPs, Journal of Artificial Intelligence Research, vol.37, pp.329-396, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00424394

N. Armstrong-Crews and M. M. Veloso, An approximate algorithm for solving oracular POMDPs, 2008 IEEE International Conference on Robotics and Automation, pp.3346-3352, 2008.
DOI : 10.1109/ROBOT.2008.4543721

K. J. Åström, Optimal control of Markov processes with incomplete state information, Journal of Mathematical Analysis and Applications, vol.10, issue.1, pp.174-205, 1965.
DOI : 10.1016/0022-247X(65)90154-X

A. G. Barto, S. J. Bradtke, and S. P. Singh, Learning to act using real-time dynamic programming, Artificial Intelligence, vol.72, issue.1-2, pp.81-138, 1995.
DOI : 10.1016/0004-3702(94)00011-O

R. Becker, S. Zilberstein, V. R. Lesser, and C. V. Goldman, Solving transition independent decentralized Markov decision processes, Journal of Artificial Intelligence Research, vol.22, pp.423-455, 2004.
DOI : 10.1145/860575.860583
URL : http://anytime.cs.umass.edu/shlomo/papers/aamas03a.pdf

R. E. Bellman, Dynamic Programming, 1957.

D. S. Bernstein, R. Givan, N. Immerman, and S. Zilberstein, The Complexity of Decentralized Control of Markov Decision Processes, Mathematics of Operations Research, vol.27, issue.4, 2002.
DOI : 10.1287/moor.27.4.819.297

S. Bistarelli, U. Montanari, and F. Rossi, Semiring-based constraint satisfaction and optimization, Journal of the ACM, vol.44, issue.2, pp.201-236, 1997.
DOI : 10.1145/256303.256306
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.4.3513

A. Boularias and B. Chaib-draa, Exact dynamic programming for decentralized POMDPs with lossless policy compression, Proceedings of the Eighteenth International Conference on Automated Planning and Scheduling, pp.20-27, 2008.

A. Canu and A. Mouaddib, Collective decision under partial observability - a dynamic local interaction model, IJCCI (ECTA-FCTA), pp.146-155, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00969318

M. C. Cooper, S. de Givry, M. Sanchez, T. Schiex, M. Zytnicki, and T. Werner, Soft arc consistency revisited, Artificial Intelligence, vol.174, issue.7-8, pp.449-478, 2010.

D. P. de Farias and B. Van Roy, The linear programming approach to approximate dynamic programming, Operations Research, vol.51, issue.6, pp.850-865, 2003.

S. de Givry, F. Heras, M. Zytnicki, and J. Larrosa, Existential arc consistency: Getting closer to full arc consistency in weighted CSPs, Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, pp.84-89, 2005.

R. Dechter, Bucket elimination: a unifying framework for processing hard and soft constraints, ACM Computing Surveys, vol.28, issue.4es, pp.51-55, 1997.
DOI : 10.1145/242224.242302

R. Dechter, Bucket elimination: A unifying framework for reasoning, Artificial Intelligence, vol.113, issue.1-2, pp.41-85, 1999.
DOI : 10.1016/S0004-3702(99)00059-4

R. Dechter, Constraint Optimization, Constraint Processing, pp.363-397, 2003.
DOI : 10.1016/B978-155860890-0/50014-1

R. Dechter and I. Rish, Mini-buckets, Journal of the ACM, vol.50, issue.2, pp.107-153, 2003.
DOI : 10.1145/636865.636866

J. S. Dibangoye, A. Mouaddib, and B. Chaib-draa, Point-based incremental pruning heuristic for solving finite-horizon DEC-POMDPs, Proceedings of the Eighth International Conference on Autonomous Agents and Multiagent Systems, pp.569-576, 2009.

J. S. Dibangoye, G. Shani, B. Chaib-draa, and A. Mouaddib, Topological order planner for POMDPs, Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, pp.1684-1689, 2009.

J. S. Dibangoye, C. Amato, and A. Doniec, Scaling up decentralized MDPs through heuristic search, Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence, pp.217-226, 2012.

J. S. Dibangoye, C. Amato, A. Doniec, and F. Charpillet, Producing efficient error-bounded solutions for transition independent decentralized MDPs, Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems, 2013.

J. S. Dibangoye, C. Amato, O. Buffet, and F. Charpillet, Exploiting separability in multi-agent planning with continuous-state MDPs, Proceedings of the Thirteenth International Conference on Autonomous Agents and Multiagent Systems, 2014.

J. W. Grizzle, S. I. Marcus, and K. Hsu, Decentralized control of a multiaccess broadcast network, 1981 20th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes, pp.390-391, 1981.
DOI : 10.1109/CDC.1981.269554

E. A. Hansen and S. Zilberstein, LAO*: A heuristic search algorithm that finds solutions with loops, Artificial Intelligence, vol.129, issue.1-2, pp.35-62, 2001.

E. A. Hansen, D. S. Bernstein, and S. Zilberstein, Dynamic programming for partially observable stochastic games, Proceedings of the Nineteenth National Conference on Artificial Intelligence, pp.709-715, 2004.

P. E. Hart, N. J. Nilsson, and B. Raphael, A formal basis for the heuristic determination of minimum cost paths, IEEE Transactions on Systems Science and Cybernetics, vol.4, issue.2, pp.100-107, 1968.

M. Hauskrecht, Value-function approximations for partially observable Markov decision processes, Journal of Artificial Intelligence Research, vol.13, pp.33-94, 2000.

M. Jain, M. E. Taylor, M. Tambe, and M. Yokoo, DCOPs meet the real world: Exploring unknown reward matrices with applications to mobile sensor networks, Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, pp.181-186, 2009.

L. P. Kaelbling, M. L. Littman, and A. R. Cassandra, Planning and acting in partially observable stochastic domains, Artificial Intelligence, vol.101, issue.1-2, pp.99-134, 1998.
DOI : 10.1016/S0004-3702(98)00023-X

R. E. Korf, Real-time heuristic search, Artificial Intelligence, vol.42, issue.2-3, pp.189-211, 1990.
DOI : 10.1016/0004-3702(90)90054-4

C. R. Kube and H. Zhang, Task Modelling in Collective Robotics, Autonomous Robots, vol.4, issue.1, pp.53-72, 1997.
DOI : 10.1007/978-1-4757-6451-2_3

A. Kumar and S. Zilberstein, Constraint-based dynamic programming for decentralized POMDPs with structured interactions, Proceedings of the Eighth International Conference on Autonomous Agents and Multiagent Systems, pp.561-568, 2009.

A. Kumar and S. Zilberstein, Point-based backup for decentralized POMDPs: complexity and new algorithms, Proceedings of the Ninth International Conference on Autonomous Agents and Multiagent Systems, pp.1315-1322, 2010.

W. S. Lovejoy, Computationally Feasible Bounds for Partially Observed Markov Decision Processes, Operations Research, vol.39, issue.1, 1991.
DOI : 10.1287/opre.39.1.162

L. C. MacDermed and C. Isbell, Point based value iteration with optimal belief compression for Dec-POMDPs, Advances in Neural Information Processing Systems 26, pp.100-108, 2013.

O. Madani, S. Hanks, and A. Condon, On the undecidability of probabilistic planning and related stochastic optimization problems, Artificial Intelligence, vol.147, issue.1-2, pp.5-34, 2003.
DOI : 10.1016/S0004-3702(02)00378-8

F. S. Melo and M. M. Veloso, Decentralized MDPs with sparse interactions, Artificial Intelligence, vol.175, issue.11, pp.1757-1789, 2011.

R. Nair, M. Tambe, M. Yokoo, D. V. Pynadath, and S. Marsella, Taming decentralized POMDPs: Towards efficient policy computation for multiagent settings, Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp.705-711, 2003.

R. Nair, P. Varakantham, M. Tambe, and M. Yokoo, Networked distributed POMDPs: A synthesis of distributed constraint optimization and POMDPs, Proceedings of the Twentieth National Conference on Artificial Intelligence, pp.133-139, 2005.

F. A. Oliehoek, Sufficient plan-time statistics for decentralized POMDPs, Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2013.

F. A. Oliehoek and M. T. J. Spaan, Tree-Based Solution Methods for Multiagent POMDPs with Delayed Communication, Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012.

F. A. Oliehoek, M. T. J. Spaan, and N. Vlassis, Optimal and approximate Q-value functions for decentralized POMDPs, Journal of Artificial Intelligence Research, vol.32, pp.289-353, 2008.

F. A. Oliehoek, S. Whiteson, and M. T. Spaan, Lossless clustering of histories in decentralized POMDPs, Proceedings of the Eighth International Conference on Autonomous Agents and Multiagent Systems, pp.577-584, 2009.

F. A. Oliehoek, M. T. J. Spaan, C. Amato, and S. Whiteson, Incremental clustering and expansion for faster optimal planning in Dec-POMDPs, Journal of Artificial Intelligence Research, vol.46, pp.449-509, 2013.

J. M. Ooi and G. W. Wornell, Decentralized control of a multiple access broadcast channel: Performance bounds, Proc. of the 35th IEEE Conference on Decision and Control, pp.293-298, 1996.

S. Paquet, B. Chaib-draa, P. Dallaire, and D. Bergeron, Task allocation learning in a multiagent environment: Application to the RoboCupRescue simulation, Multiagent and Grid Systems, vol.6, issue.4, pp.293-314, 2010.
DOI : 10.3233/MGS-2010-0153

J. Pearl, Some Recent Results in Heuristic Search Theory, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.6, issue.1, pp.1-13, 1984.
DOI : 10.1109/TPAMI.1984.4767470

J. Pineau, G. J. Gordon, and S. Thrun, Anytime point-based approximations for large POMDPs, Journal of Artificial Intelligence Research, vol.27, pp.335-380, 2006.

W. B. Powell, Approximate Dynamic Programming: Solving the Curses of Dimensionality (Wiley Series in Probability and Statistics), 2007.

M. L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.

T. Schiex, H. Fargier, and G. Verfaillie, Valued constraint satisfaction problems: Hard and easy problems, Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, pp.631-639, 1995.

G. Shani, J. Pineau, and R. Kaplow, A survey of point-based POMDP solvers, Autonomous Agents and Multi-Agent Systems, vol.17, issue.2, pp.1-51, 2013.
DOI : 10.1007/s10458-012-9200-2

D. Silver and J. Veness, Monte-carlo planning in large POMDPs, Advances in Neural Information Processing Systems 23, pp.2164-2172, 2010.

R. D. Smallwood and E. J. Sondik, The optimal control of partially observable Markov decision processes over a finite horizon, Operations Research, vol.21, issue.5, pp.1071-1088, 1973.

T. Smith, Probabilistic Planning for Robotic Exploration, The Robotics Institute, 2007.

T. Smith and R. Simmons, Heuristic search value iteration for POMDPs, Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence, pp.520-527, 2004.

T. Smith and R. G. Simmons, Focused real-time dynamic programming for MDPs: Squeezing more out of a heuristic, Proceedings of the Twenty-First AAAI Conference on Artificial Intelligence, pp.1227-1232, 2006.

R. S. Sutton and A. G. Barto, Introduction to Reinforcement Learning, 1998.

I. Suzuki and M. Yamashita, Distributed Anonymous Mobile Robots: Formation of Geometric Patterns, SIAM Journal on Computing, vol.28, issue.4, pp.1347-1363, 1999.
DOI : 10.1137/S009753979628292X

D. Szer, F. Charpillet, and S. Zilberstein, MAA*: A heuristic search algorithm for solving decentralized POMDPs, Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence, pp.568-576, 2005.
URL : https://hal.archives-ouvertes.fr/inria-00000204

J. N. Tsitsiklis and B. Van Roy, Feature-based methods for large scale dynamic programming, Machine Learning, pp.59-94, 1996.

A. Wald, Contributions to the Theory of Statistical Estimation and Testing Hypotheses, The Annals of Mathematical Statistics, vol.10, issue.4, pp.299-326, 1939.
DOI : 10.1214/aoms/1177732144

K. Winstein and H. Balakrishnan, TCP ex Machina: Computer-Generated Congestion Control, SIGCOMM, 2013.

N. L. Zhang and W. Zhang, Speeding up the convergence of value iteration in partially observable Markov decision processes, Journal of Artificial Intelligence Research, vol.14, pp.29-51, 2001.

S. Zilberstein, R. Washington, D. S. Bernstein, and A. Mouaddib, Decision-theoretic control of planetary rovers, Revised Papers from the International Seminar on Advances in Plan-Based Control of Robotic Agents, pp.270-289, 2002.