Policy Search by Dynamic Programming, Advances in Neural Information Processing Systems 16, 2004. ,
Dynamic Programming, 1957. ,
The Complexity of Decentralized Control of Markov Decision Processes, Proc. of the Sixteenth Conf. on Uncertainty in AI, 2000. ,
DOI : 10.1287/moor.27.4.819.297
Iterative Solutions of Games by Fictitious Play, Activity Analysis of Production and Allocation, 1951. ,
Optimally Solving Dec-POMDPs as Continuous-State MDPs, Journal of AI Research, vol.55, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-00907338
Counterfactual Multi-Agent Policy Gradients, 2017. ,
Coordinated Reinforcement Learning, Proc. of the Eighteenth Int. Conf. on ML, 2002. ,
Dynamic Programming for Partially Observable Stochastic Games, Proc. of the Nineteenth National Conf. on AI, 2004. ,
Dynamic Programming and Markov Processes, 1960. ,
Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm, Proc. of the Fifteenth Int. Conf. on ML, 1998. ,
Sparse Cooperative Q-learning, Proc. of the Twentieth Int. Conf. on ML, 2004. ,
Multi-agent reinforcement learning as a rehearsal for decentralized planning, Neurocomputing, vol.190, pp.82-94, 2016. ,
DOI : 10.1016/j.neucom.2016.01.031
Constraint-based dynamic programming for decentralized POMDPs with structured interactions, Proc. of the Eighth Int. Conf. on Autonomous Agents and Multiagent Systems, 2009. ,
Markov games as a framework for multiagent reinforcement learning, Proc. of the Eleventh Int. Conf. on ML, 1994. ,
Stickbreaking policy learning in Dec-POMDPs, Int. Joint Conf. on AI (IJCAI) 2015. AAAI, 2015. ,
Learning for Decentralized Control of Multiagent Systems in Large, Partially-Observable Stochastic Environments, AAAI, 2016. ,
Point Based Value Iteration with Optimal Belief Compression for Dec-POMDPs, Advances in Neural Information Processing Systems 26, 2013. ,
When Evolving Populations is Better Than Coevolving Individuals: The Blind Mice Problem, Proc. of the 18th Int. Joint Conf. on AI, IJCAI'03, 2003. ,
Human-level control through deep reinforcement learning, Nature, vol.518, issue.7540, pp.529-533, 2015. ,
DOI : 10.1038/nature14236
Emergence of Grounded Compositional Language in Multi-Agent Populations. CoRR, abs, 1703. ,
Optimal Control Strategies in Delayed Sharing Information Structures, IEEE Transactions on Automatic Control, vol.56, issue.7, 2011. ,
Sufficient Plan-Time Statistics for Decentralized POMDPs, Proc. of the Twenty-Fourth Int. Joint Conf. on AI, 2013. ,
Optimal and Approximate Q-value Functions for Decentralized POMDPs, Journal of Artificial Intelligence Research, vol.32, 2008. ,
DOI : 10.1613/jair.2447
Incremental Clustering and Expansion for Faster Optimal Planning in Dec-POMDPs, Journal of AI Research, vol.46, 2013. ,
Cooperative Multi-Agent Learning: The State of the Art, Autonomous Agents and Multi-Agent Systems, vol.4, issue.2-3, 2005. ,
DOI : 10.1007/3-540-60923-7_20
Learning to Cooperate via Policy Search, Sixteenth Conf. on Uncertainty in Artificial Intelligence (UAI-2000), 2000. ,
Point-Based Value Iteration for Continuous POMDPs, J. Mach. Learn. Res, vol.7, 2006. ,
Markov Decision Processes, Discrete Stochastic Dynamic Programming, 1994. ,
Team Decision Problems, The Annals of Mathematical Statistics, vol.33, issue.3, 1962. ,
DOI : 10.1214/aoms/1177704455
A Stochastic Approximation Method, The Annals of Mathematical Statistics, 1951. ,
Convex analysis, Princeton Mathematical Series. Princeton, N. J, 1970. ,
DOI : 10.1515/9781400873173
On-line Q-learning using connectionist systems, 1994. ,
Learning Team Strategies: Soccer Case Studies, ML, vol.33, issue.2-3, 1998. ,
A survey of point-based POMDP solvers, Journal of Autonomous Agents and Multi-Agent Systems, vol.27, issue.1, 2013. ,
DOI : 10.1007/s10458-012-9200-2
Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, 1998. ,
DOI : 10.1109/TNN.1998.712192
MAA*: A Heuristic Search Algorithm for Solving Decentralized POMDPs, Proc. of the Twenty-First Conf. on Uncertainty in AI, 2005. ,
URL : https://hal.archives-ouvertes.fr/inria-00000204
Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents, Readings in Agents, 1998. ,
DOI : 10.1016/B978-1-55860-307-3.50049-6
, , 1992.
, Proc. of the Twenty-Fourth Int. Joint Conf. on AI, 2013.
, Proc. of the Twenty-Fifth AAAI Conf. on AI, 2011.