Incremental Policy Generation for Finite-Horizon DEC-POMDPs, ICAPS, 2009.
Policy Search by Dynamic Programming, NIPS, 2004.
, , 1957.
The Complexity of Decentralized Control of Markov Decision Processes, UAI, 2000.
DOI : 10.1287/moor.27.4.819.297
On the Study of Cooperative Multi-Agent Policy Gradient, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01821677
Iterative Solutions of Games by Fictitious Play, Activity Analysis of Production and Allocation, 1951.
Learning to Act in Decentralized Partially Observable MDPs, Research report, INRIA Grenoble - Rhône-Alpes - CHROMA Team, 2018.
Point-based incremental pruning heuristic for solving finite-horizon DEC-POMDPs, AAMAS, pp.569-576, 2009.
Optimally Solving Dec-POMDPs as Continuous-State MDPs, IJCAI, pp.90-96, 2013.
DOI : 10.1613/jair.4623
URL : https://hal.archives-ouvertes.fr/hal-00907338
Error-Bounded Approximations for Infinite-Horizon Discounted Decentralized POMDPs, ECML, pp.338-353, 2014.
DOI : 10.1007/978-3-662-44848-9_22
URL : https://hal.archives-ouvertes.fr/hal-01096610
Structural Results for Cooperative Decentralized Control Models, IJCAI, pp.46-52, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01188481
Optimally Solving Dec-POMDPs as Continuous-State MDPs, Journal of Artificial Intelligence Research, vol.55, 2016.
DOI : 10.1613/jair.4623
URL : https://hal.archives-ouvertes.fr/hal-00907338
Counterfactual Multi-Agent Policy Gradients, 2017.
Coordinated Reinforcement Learning, ICML, 2002.
Dynamic Programming for Partially Observable Stochastic Games, AAAI, 2004.
Dynamic Programming and Markov Processes, 1960.
Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm, ICML, 1998.
Sparse Cooperative Q-learning, Twenty-first International Conference on Machine Learning, ICML '04, 2004.
DOI : 10.1145/1015330.1015410
Multi-agent reinforcement learning as a rehearsal for decentralized planning, Neurocomputing, vol.190, 2016.
DOI : 10.1016/j.neucom.2016.01.031
Constraint-based dynamic programming for decentralized POMDPs with structured interactions, AAMAS, 2009.
Markov games as a framework for multiagent reinforcement learning, ICML, 1994.
Stick-breaking policy learning in Dec-POMDPs, IJCAI, AAAI, 2015.
Learning for Decentralized Control of Multiagent Systems in Large, Partially-Observable Stochastic Environments, AAAI, 2016.
Point Based Value Iteration with Optimal Belief Compression for Dec-POMDPs, NIPS, 2013.
When Evolving Populations is Better Than Coevolving Individuals: The Blind Mice Problem, IJCAI, 2003.
Human-level control through deep reinforcement learning, Nature, vol.518, issue.7540, pp.529-533, 2015.
DOI : 10.1038/nature14236
Mordatch, I. and Abbeel, P., Emergence of Grounded Compositional Language in Multi-Agent Populations, 1703.
Optimal Control Strategies in Delayed Sharing Information Structures, IEEE Transactions on Automatic Control, vol.56, issue.7, 2011.
Policy gradient with value function approximation for collective multiagent planning, NIPS, pp.4319-4329, 2017.
Sufficient Plan-Time Statistics for Decentralized POMDPs, IJCAI, 2013.
Optimal and Approximate Q-value Functions for Decentralized POMDPs, Journal of Artificial Intelligence Research, vol.32, 2008.
DOI : 10.1613/jair.2447
URL : http://orbilu.uni.lu/bitstream/10993/11026/1/live-2447-3856-jair.pdf
Heuristic search for identical payoff Bayesian games, AAMAS, pp.1115-1122, 2010.
Incremental Clustering and Expansion for Faster Optimal Planning in Dec-POMDPs, Journal of Artificial Intelligence Research, vol.46, 2013.
DOI : 10.1613/jair.3804
URL : https://jair.org/index.php/jair/article/download/10806/25794
Cooperative Multi-Agent Learning: The State of the Art, Autonomous Agents and Multi-Agent Systems, vol.11, issue.3, 2005.
DOI : 10.1007/3-540-60923-7_20
Learning to Cooperate via Policy Search, UAI, 2000.
Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
Team Decision Problems, The Annals of Mathematical Statistics, vol.33, issue.3, 1962.
DOI : 10.1214/aoms/1177704455
URL : https://doi.org/10.1214/aoms/1177704455
A Stochastic Approximation Method, The Annals of Mathematical Statistics, 1951.
On-line Q-learning using connectionist systems, 1994.
Learning Team Strategies: Soccer Case Studies, Machine Learning, vol.33, issue.2-3, 1998.
A survey of point-based POMDP solvers, Autonomous Agents and Multi-Agent Systems, vol.27, issue.1, 2013.
Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, 1998.
DOI : 10.1109/TNN.1998.712192
A Heuristic Search Algorithm for Solving Decentralized POMDPs, UAI, 2005.
URL : https://hal.archives-ouvertes.fr/inria-00000204
Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents, Readings in Agents, 1998.
DOI : 10.1016/B978-1-55860-307-3.50049-6
, , 1992.
Monte-Carlo Expectation Maximization for Decentralized POMDPs, IJCAI, 2013.
Coordinated Multi-Agent Reinforcement Learning in Networked Distributed POMDPs, AAAI, 2011.