C. Amato, D. S. Bernstein, and S. Zilberstein, Optimizing fixed-size stochastic controllers for POMDPs and decentralized POMDPs, Autonomous Agents and Multi-Agent Systems, vol.21, issue.3, pp.293-320, 2010.
DOI : 10.1007/s10458-009-9103-z

C. Amato, G. Chowdhary, A. Geramifard, N. K. Ure, and M. J. Kochenderfer, Decentralized control of partially observable Markov decision processes, 52nd IEEE Conference on Decision and Control, 2013.
DOI : 10.1109/CDC.2013.6760239

C. Amato, J. S. Dibangoye, and S. Zilberstein, Incremental policy generation for finite-horizon DEC-POMDPs, Proceedings of the Nineteenth International Conference on Automated Planning and Scheduling, 2009.

C. Amato, G. D. Konidaris, A. Anders, G. Cruz, J. P. How et al., Policy search for multi-robot coordination under uncertainty, Proceedings of the Robotics: Science and Systems Conference, 2015.

C. Amato, G. D. Konidaris, and L. P. Kaelbling, Planning with macro-actions in decentralized POMDPs, Proceedings of the Thirteenth International Conference on Autonomous Agents and Multiagent Systems, 2014.

R. Aras and A. Dutech, An investigation into mathematical programming for finite horizon decentralized POMDPs, Journal of Artificial Intelligence Research, vol.37, pp.329-396, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00439627

B. Banerjee, J. Lyle, L. Kraemer, and R. Yellamraju, Sample bounded distributed reinforcement learning for decentralized POMDPs, Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, pp.1256-1262, 2012.

A. G. Barto, S. J. Bradtke, and S. P. Singh, Learning to act using real-time dynamic programming, Artificial Intelligence, vol.72, issue.1-2, pp.81-138, 1995.
DOI : 10.1016/0004-3702(94)00011-O

R. Becker, S. Zilberstein, V. R. Lesser, and C. V. Goldman, Solving transition independent decentralized Markov decision processes, Journal of Artificial Intelligence Research, vol.22, pp.423-455, 2004.

R. E. Bellman, Dynamic Programming, 1957.

D. S. Bernstein, R. Givan, N. Immerman, and S. Zilberstein, The Complexity of Decentralized Control of Markov Decision Processes, Mathematics of Operations Research, vol.27, issue.4, 2002.
DOI : 10.1287/moor.27.4.819.297

A. Boularias and B. Chaib-draa, Exact dynamic programming for decentralized POMDPs with lossless policy compression, Proceedings of the Eighteenth International Conference on Automated Planning and Scheduling, pp.20-27, 2008.

A. Canu and A. Mouaddib, Collective decision under partial observability - a dynamic local interaction model, IJCCI (ECTA-FCTA), pp.146-155, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00969318

A. Carlin and S. Zilberstein, Value-based observation compression for DEC-POMDPs, Proceedings of the Seventh International Conference on Autonomous Agents and Multiagent Systems, 2008.

D. P. de Farias and B. Van Roy, The Linear Programming Approach to Approximate Dynamic Programming, Operations Research, vol.51, issue.6, pp.850-865, 2003.
DOI : 10.1287/opre.51.6.850.24925

J. S. Dibangoye, C. Amato, O. Buffet, and F. Charpillet, Exploiting separability in multiagent planning with continuous-state MDPs, Proceedings of the Thirteenth International Conference on Autonomous Agents and Multiagent Systems, pp.1281-1288, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01092066

J. S. Dibangoye, C. Amato, O. Buffet, and F. Charpillet, Exploiting separability in multiagent planning with continuous-state MDPs (extended abstract), Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, pp.4254-4260, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01188483

J. S. Dibangoye, C. Amato, and A. Doniec, Scaling up decentralized MDPs through heuristic search, Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence, pp.217-226, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00765221

J. S. Dibangoye, C. Amato, A. Doniec, and F. Charpillet, Producing efficient error-bounded solutions for transition independent decentralized MDPs, Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems, pp.539-546, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00918066

J. S. Dibangoye, O. Buffet, and O. Simonin, Structural results for cooperative decentralized control models, Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, pp.46-52, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01188481

J. S. Dibangoye, A. Mouaddib, and B. Chaib-draa, Point-based incremental pruning heuristic for solving finite-horizon DEC-POMDPs, Proceedings of the Eighth International Conference on Autonomous Agents and Multiagent Systems, pp.569-576, 2009.

J. S. Dibangoye, A. Mouaddib, and B. Chaib-draa, Toward error-bounded algorithms for infinite-horizon Dec-POMDPs, Proceedings of the Tenth International Conference on Autonomous Agents and Multiagent Systems, pp.947-954, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00969579

J. S. Dibangoye, G. Shani, B. Chaib-draa, and A. Mouaddib, Topological order planner for POMDPs, Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, pp.1684-1689, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00965737

J. S. Dibangoye, C. Amato, O. Buffet, and F. Charpillet, Optimally solving Dec-POMDPs as continuous-state MDPs, Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00907338

J. S. Dibangoye, O. Buffet, and F. Charpillet, Error-bounded approximations for infinite-horizon discounted decentralized POMDPs, Proceedings of the Twenty-Fourth European Conference on Machine Learning, pp.338-353, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01096610

J. S. Dibangoye, B. Chaib-draa, and A. Mouaddib, A novel prioritization technique for solving Markov decision processes, Proceedings of the 21st International Conference of the Florida Artificial Intelligence Research Society, pp.537-542, 2008.

J. W. Grizzle, S. I. Marcus, and K. Hsu, Decentralized control of a multiaccess broadcast network, 1981 20th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes, pp.390-391, 1981.
DOI : 10.1109/CDC.1981.269554

E. A. Hansen, D. S. Bernstein, and S. Zilberstein, Dynamic programming for partially observable stochastic games, Proceedings of the Nineteenth National Conference on Artificial Intelligence, pp.709-715, 2004.

E. A. Hansen and S. Zilberstein, LAO*: A heuristic search algorithm that finds solutions with loops, Artificial Intelligence, vol.129, issue.1-2, pp.35-62, 2001.
DOI : 10.1016/S0004-3702(01)00106-0

P. E. Hart, N. J. Nilsson, and B. Raphael, A Formal Basis for the Heuristic Determination of Minimum Cost Paths, IEEE Transactions on Systems Science and Cybernetics, vol.4, issue.2, pp.100-107, 1968.
DOI : 10.1109/TSSC.1968.300136

M. Hauskrecht, Value-function approximations for partially observable Markov decision processes, Journal of Artificial Intelligence Research, vol.13, pp.33-94, 2000.

R. A. Howard, Dynamic Programming and Markov Processes, The M.I.T. Press, 1960.

M. Jain, M. E. Taylor, M. Tambe, and M. Yokoo, DCOPs meet the real world: Exploring unknown reward matrices with applications to mobile sensor networks, Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, pp.181-186, 2009.

L. P. Kaelbling, M. L. Littman, and A. R. Cassandra, Planning and acting in partially observable stochastic domains, Artificial Intelligence, vol.101, issue.1-2, pp.99-134, 1998.
DOI : 10.1016/S0004-3702(98)00023-X

R. E. Korf, Real-time heuristic search, Artificial Intelligence, vol.42, issue.2-3, pp.189-211, 1990.
DOI : 10.1016/0004-3702(90)90054-4
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.161.809

A. Kumar and S. Zilberstein, Constraint-based dynamic programming for decentralized POMDPs with structured interactions, Proceedings of the Eighth International Conference on Autonomous Agents and Multiagent Systems, pp.561-568, 2009.

A. Kumar and S. Zilberstein, Point-based backup for decentralized POMDPs: complexity and new algorithms, Proceedings of the Ninth International Conference on Autonomous Agents and Multiagent Systems, pp.1315-1322, 2010.

R. Nair, M. Tambe, M. Yokoo, D. V. Pynadath, and S. Marsella, Taming decentralized POMDPs: Towards efficient policy computation for multiagent settings, Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp.705-711, 2003.

R. Nair, P. Varakantham, M. Tambe, and M. Yokoo, Networked distributed POMDPs: A synthesis of distributed constraint optimization and POMDPs, Proceedings of the Twentieth National Conference on Artificial Intelligence, pp.133-139, 2005.

F. A. Oliehoek, Decentralized POMDPs, Reinforcement Learning: State of the Art, pp.471-503, 2012.
DOI : 10.1007/978-3-642-27645-3_15

F. A. Oliehoek, Sufficient plan-time statistics for decentralized POMDPs, Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2013.

F. A. Oliehoek, M. T. Spaan, C. Amato, and S. Whiteson, Incremental clustering and expansion for faster optimal planning in Dec-POMDPs, Journal of Artificial Intelligence Research, vol.46, pp.449-509, 2013.

F. A. Oliehoek, M. T. Spaan, and N. A. Vlassis, Optimal and approximate Q-value functions for decentralized POMDPs, Journal of Artificial Intelligence Research, vol.32, pp.289-353, 2008.
DOI : 10.1145/1329125.1329390
URL : http://orbilu.uni.lu/handle/10993/11032

F. A. Oliehoek and M. T. Spaan, Tree-based solution methods for multiagent POMDPs with delayed communication, Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012.

F. A. Oliehoek, S. Whiteson, and M. T. Spaan, Lossless clustering of histories in decentralized POMDPs, Proceedings of the Eighth International Conference on Autonomous Agents and Multiagent Systems, pp.577-584, 2009.

J. M. Ooi and G. W. Wornell, Decentralized control of a multiple access broadcast channel: performance bounds, Proceedings of 35th IEEE Conference on Decision and Control, pp.293-298, 1996.
DOI : 10.1109/CDC.1996.574318

J. Pajarinen, A. Hottinen, and J. Peltonen, Optimizing Spatial and Temporal Reuse in Wireless Networks by Decentralized Partially Observable Markov Decision Processes, IEEE Transactions on Mobile Computing, vol.13, issue.4, 2013.
DOI : 10.1109/TMC.2013.39

S. Paquet, B. Chaib-draa, P. Dallaire, and D. Bergeron, Task allocation learning in a multiagent environment: Application to the RoboCupRescue simulation, Multiagent and Grid Systems, vol.6, issue.4, pp.293-314, 2010.
DOI : 10.3233/MGS-2010-0153

J. Pineau, G. J. Gordon, and S. Thrun, Anytime point-based approximations for large POMDPs, Journal of Artificial Intelligence Research, vol.27, pp.335-380, 2006.

W. B. Powell, Approximate Dynamic Programming: Solving the Curses of Dimensionality, 2007.
DOI : 10.1002/9781118029176

M. L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.

N. Roy, G. J. Gordon, and S. Thrun, Finding approximate POMDP solutions through belief compression, Journal of Artificial Intelligence Research, vol.23, pp.1-40, 2005.

S. Seuken and S. Zilberstein, Improved memory-bounded dynamic programming for DEC-POMDPs, Proceedings of the Twenty-Third Conference on Uncertainty in Artificial Intelligence, 2007.

G. Shani, J. Pineau, and R. Kaplow, A survey of point-based POMDP solvers, Autonomous Agents and Multi-Agent Systems, vol.27, issue.1, pp.1-51, 2013.
DOI : 10.1007/s10458-012-9200-2

R. D. Smallwood and E. J. Sondik, The Optimal Control of Partially Observable Markov Processes over a Finite Horizon, Operations Research, vol.21, issue.5, pp.1071-1088, 1973.
DOI : 10.1287/opre.21.5.1071

T. Smith, Probabilistic Planning for Robotic Exploration, The Robotics Institute, 2007.

T. Smith and R. Simmons, Heuristic search value iteration for POMDPs, Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence, pp.520-527, 2004.

T. Smith and R. G. Simmons, Focused real-time dynamic programming for MDPs: Squeezing more out of a heuristic, Proceedings of the Twenty-First AAAI Conference on Artificial Intelligence, pp.1227-1232, 2006.

M. T. Spaan, F. A. Oliehoek, and C. Amato, Scaling up optimal heuristic search in Dec-POMDPs via incremental expansion, Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence, pp.2027-2032, 2011.

D. Szer, F. Charpillet, and S. Zilberstein, MAA*: A heuristic search algorithm for solving decentralized POMDPs, Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence, pp.568-576, 2005.
URL : https://hal.archives-ouvertes.fr/inria-00000204

J. N. Tsitsiklis and B. Van Roy, Feature-based methods for large scale dynamic programming, Machine Learning, vol.22, issue.1-3, pp.59-94, 1996.

P. Velagapudi, P. Varakantham, K. Sycara, and P. Scerri, Distributed model shaping for scaling to decentralized POMDPs with hundreds of agents, Proceedings of the Tenth International Conference on Autonomous Agents and Multiagent Systems, pp.955-962, 2011.

K. Winstein and H. Balakrishnan, TCP ex Machina: Computer-generated congestion control, 2013.

F. Wu, S. Zilberstein, and X. Chen, Point-based policy generation for decentralized POMDPs, Proceedings of the Ninth International Conference on Autonomous Agents and Multiagent Systems, pp.1307-1314, 2010.

F. Wu, S. Zilberstein, and X. Chen, Online planning for multi-agent systems with bounded communication, Artificial Intelligence, vol.175, issue.2, pp.487-511, 2011.
DOI : 10.1016/j.artint.2010.09.008

S. Zilberstein, R. Washington, D. S. Bernstein, and A. Mouaddib, Decision-Theoretic Control of Planetary Rovers, Revised Papers from the International Seminar on Advances in Plan-Based Control of Robotic Agents, pp.270-289, 2002.
DOI : 10.1007/3-540-37724-7_16