Conference paper

Learning to Act in Decentralized Partially Observable MDPs

Jilles Dibangoye (1), Olivier Buffet (2)
(1) CHROMA - Cooperative and Human-aware Robots in Dynamic Environments, Inria Grenoble - Rhône-Alpes, CITI - Centre of Innovation in Telecommunications and Integration of services
(2) LARSEN - Lifelong Autonomy and interaction skills for Robots in a Sensing ENvironment, Inria Nancy - Grand Est, LORIA - AIS - Department of Complex Systems, Artificial Intelligence & Robotics
Abstract: We address a long-standing open problem of reinforcement learning in decentralized partially observable Markov decision processes (Dec-POMDPs). Previous attempts focused on different forms of generalized policy iteration, which at best led to local optima. In this paper, we restrict attention to plans, which are simpler to store and update than policies. Under certain conditions, we derive the first near-optimal cooperative multi-agent reinforcement learning algorithm. To achieve significant scalability gains, we replace greedy maximization with mixed-integer linear programming. Experiments show our approach can learn to act near-optimally in many finite domains from the literature.
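The scalability point in the abstract — replacing greedy maximization with mixed-integer linear programming — concerns the argmax over joint decentralized decision rules. The toy sketch below is not the paper's algorithm; the observations, actions, and `q_value` function are all hypothetical. It only illustrates what exhaustive greedy maximization over joint deterministic decision rules looks like, and why its cost (exponential in the number of agents and observations) motivates an optimization-based replacement such as a MILP.

```python
from itertools import product

# Hypothetical toy stage problem: 2 agents, each maps a private
# observation to an action via a deterministic decision rule.
observations = ["o0", "o1"]   # per-agent observations (assumed)
actions = [0, 1]              # per-agent actions (assumed)

# A decision rule for one agent: (action for o0, action for o1).
rules = list(product(actions, repeat=len(observations)))

# Hypothetical stage value of a joint decision rule (stand-in for an
# occupancy-weighted Q-value): reward agents for matching actions on
# observation o0 and choosing different actions on o1.
def q_value(rule1, rule2):
    return (1.0 if rule1[0] == rule2[0] else 0.0) + \
           (1.0 if rule1[1] != rule2[1] else 0.0)

# Greedy maximization by exhaustive enumeration of joint decision
# rules. The search space has size |A|^(n * |O|), which is what a
# MILP formulation of this argmax is meant to avoid enumerating.
best = max(product(rules, rules), key=lambda jr: q_value(*jr))
print(best, q_value(*best))
```

With 2 actions, 2 observations, and 2 agents there are already 16 joint rules; adding one agent or one observation multiplies that count by 4, which is the combinatorial blow-up the MILP reformulation sidesteps.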
Cited literature: 48 references

https://hal.inria.fr/hal-01851806
Contributor: Jilles Steeve Dibangoye
Submitted on: Monday, July 30, 2018 - 11:51:21 PM
Last modification on: Friday, December 18, 2020 - 9:54:02 AM
Long-term archiving on: Wednesday, October 31, 2018 - 2:02:29 PM

File: paper.pdf (produced by the authors)
Identifiers

  • HAL Id: hal-01851806, version 1

Citation

Jilles Dibangoye, Olivier Buffet. Learning to Act in Decentralized Partially Observable MDPs. ICML 2018 - 35th International Conference on Machine Learning, Jul 2018, Stockholm, Sweden. pp.1233-1242. ⟨hal-01851806⟩
