Incremental Policy Generation for Finite-Horizon DEC-POMDPs, ICAPS, 2009. ,
The Complexity of Decentralized Control of Markov Decision Processes, UAI, 2000. ,
DOI : 10.1287/moor.27.4.819.297
Iterative Solutions of Games by Fictitious Play, Activity Analysis of Production and Allocation, 1951. ,
Chaib-draa. Point-based incremental pruning heuristic for solving finite-horizon DEC-POMDPs, AAMAS, pp.569-576, 2009. ,
Optimally Solving Dec-POMDPs as Continuous-State MDPs, IJCAI, pp.90-96, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00907338
Optimally Solving Dec-POMDPs as Continuous-State MDPs: Theory and Algorithms, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-00975802
Exploiting Separability in Multiagent Planning with Continuous-State MDPs (Extended Abstract), IJCAI, pp.4254-4260, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01188483
Structural Results for Cooperative Decentralized Control Models, IJCAI, pp.46-52, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01188481
Optimally Solving Dec-POMDPs as Continuous-State MDPs, Journal of AI Research, vol.55, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-00907338
Counterfactual Multi-Agent Policy Gradients, 2017. ,
Coordinated Reinforcement Learning, ICML, 2002. ,
Dynamic Programming for Partially Observable Stochastic Games, AAAI, 2004. ,
Dynamic Programming and Markov Processes, 1960. ,
Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm, ICML, 1998. ,
Sparse Cooperative Q-learning, Twenty-first International Conference on Machine Learning (ICML '04), 2004. ,
DOI : 10.1145/1015330.1015410
URL : http://www.aicml.cs.ualberta.ca/banff04/icml/pages/papers/267.pdf
Multi-agent reinforcement learning as a rehearsal for decentralized planning, Neurocomputing, vol.190, 2016. ,
DOI : 10.1016/j.neucom.2016.01.031
URL : https://manuscript.elsevier.com/S0925231216000783/pdf/S0925231216000783.pdf
Constraint-based dynamic programming for decentralized POMDPs with structured interactions, AAMAS, 2009. ,
Markov games as a framework for multi-agent reinforcement learning, ICML, 1994. ,
DOI : 10.1016/B978-1-55860-335-6.50027-1
URL : http://www.ee.duke.edu/~lcarin/emag/seminar_presentations/Markov_Games_Littman.pdf
Stick-breaking policy learning in Dec-POMDPs, IJCAI, AAAI Press, 2015. ,
Learning for Decentralized Control of Multiagent Systems in Large, Partially-Observable Stochastic Environments, AAAI, 2016. ,
Point Based Value Iteration with Optimal Belief Compression for Dec-POMDPs, NIPS, 2013. ,
When Evolving Populations is Better Than Coevolving Individuals: The Blind Mice Problem, IJCAI, 2003. ,
Human-level control through deep reinforcement learning, Nature, vol.518, issue.7540, pp.529-533, 2015. ,
DOI : 10.1038/nature14236
Emergence of Grounded Compositional Language in Multi-Agent Populations. CoRR, abs, 1703. ,
Optimal Control Strategies in Delayed Sharing Information Structures, IEEE Transactions on Automatic Control, vol.56, issue.7, 2011. ,
DOI : 10.1109/tac.2010.2089381
URL : http://arxiv.org/pdf/1002.4172
Policy gradient with value function approximation for collective multiagent planning, NIPS, pp.4319-4329 ,
Sufficient Plan-Time Statistics for Decentralized POMDPs, IJCAI, 2013. ,
Optimal and Approximate Q-value Functions for Decentralized POMDPs, Journal of AI Research, vol.32, 2008. ,
DOI : 10.1145/1329125.1329390
URL : http://orbilu.uni.lu/bitstream/10993/11032/1/download.pdf
Heuristic search for identical payoff Bayesian games, AAMAS, pp.1115-1122, 2010. ,
Incremental Clustering and Expansion for Faster Optimal Planning in Dec-POMDPs, Journal of AI Research, vol.46, 2013. ,
Cooperative Multi-Agent Learning: The State of the Art, Autonomous Agents and Multi-Agent Systems, vol.4, issue.2-3, 2005. ,
DOI : 10.1007/3-540-60923-7_20
Learning to Cooperate via Policy Search, UAI, 2000. ,
Markov Decision Processes, Discrete Stochastic Dynamic Programming, 1994. ,
Team Decision Problems, The Annals of Mathematical Statistics, vol.33, issue.3, 1962. ,
DOI : 10.1214/aoms/1177704455
URL : http://doi.org/10.1214/aoms/1177704455
A Stochastic Approximation Method, The Annals of Mathematical Statistics, 1951. ,
Convex analysis, Princeton Mathematical Series. Princeton, N. J, 1970. ,
DOI : 10.1515/9781400873173
On-line Q-learning using connectionist systems, 1994. ,
Learning Team Strategies: Soccer Case Studies, Machine Learning, vol.33, issue.2-3, 1998. ,
A survey of point-based POMDP solvers, Autonomous Agents and Multi-Agent Systems, vol.17, issue.2, 2013. ,
DOI : 10.1016/j.csl.2006.06.008
Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, 1998. ,
DOI : 10.1109/TNN.1998.712192
MAA*: A Heuristic Search Algorithm for Solving Decentralized POMDPs, UAI, 2005. ,
URL : https://hal.archives-ouvertes.fr/inria-00000204
Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents, Readings in Agents, 1998. ,
DOI : 10.1016/B978-1-55860-307-3.50049-6
Monte-Carlo Expectation Maximization for Decentralized POMDPs, IJCAI, 2013. ,
Coordinated Multi-Agent Reinforcement Learning in Networked Distributed POMDPs, AAAI, 2011. ,
Inria Research Centre Grenoble - Rhône-Alpes, Inovallée, 655 avenue de l'Europe, Montbonnot, 38334 Saint Ismier Cedex
Publisher: Inria, Domaine de Voluceau - Rocquencourt, BP 105, 78153 Le Chesnay Cedex, inria.fr
ISSN 0249-6399