Optimally Solving Dec-POMDPs as Continuous-State MDPs

Jilles Steeve Dibangoye 1 Christopher Amato 2 Olivier Buffet 3 François Charpillet 4
1 CHROMA - Robots coopératifs et adaptés à la présence humaine en environnements dynamiques
Inria Grenoble - Rhône-Alpes, CITI - CITI Centre of Innovation in Telecommunications and Integration of services
4 LARSEN - Lifelong Autonomy and interaction skills for Robots in a Sensing ENvironment
Inria Nancy - Grand Est, LORIA - AIS - Department of Complex Systems, Artificial Intelligence & Robotics
Abstract : Decentralized partially observable Markov decision processes (Dec-POMDPs) provide a general model for decision-making under uncertainty in decentralized settings, but are difficult to solve optimally (NEXP-complete). As a new way of solving these problems, we introduce the idea of transforming a Dec-POMDP into a continuous-state deterministic MDP with a piecewise-linear and convex value function. This approach makes use of the fact that planning can be accomplished in a centralized offline manner, while execution can still be decentralized. This new Dec-POMDP formulation, which we call an occupancy MDP, allows powerful POMDP and continuous-state MDP methods to be applied to Dec-POMDPs for the first time. To provide scalability, we refine this approach by combining heuristic search and compact representations that exploit the structure present in multi-agent domains, without losing the ability to converge to an optimal solution. In particular, we introduce a feature-based heuristic search value iteration (FB-HSVI) algorithm that relies on feature-based compact representations, point-based updates and efficient action selection. A theoretical analysis demonstrates that FB-HSVI terminates in finite time with an optimal solution. We include an extensive empirical analysis on well-known benchmarks, demonstrating that our approach provides significant scalability improvements over the state of the art.
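The two properties the abstract names (deterministic transitions and a piecewise-linear and convex value function) can be illustrated with a minimal sketch. It assumes, as a reading that this page does not spell out, that an occupancy state is a probability distribution over pairs of hidden state and joint history, and that a decentralized decision rule maps each agent's own history to an action. The function names, the toy dynamics and the alpha vectors below are hypothetical and are not taken from the article or from FB-HSVI.

```python
from collections import defaultdict


def next_occupancy(occupancy, decision_rule, dynamics):
    """Deterministic occupancy-state update (illustrative assumption).

    occupancy:     {(state, joint_history): probability}
    decision_rule: one dict per agent, mapping its own history to an action
    dynamics:      f(state, joint_action) -> {(next_state, joint_obs): probability}
    """
    nxt = defaultdict(float)
    for (state, joint_history), p in occupancy.items():
        # Each agent picks its action from its own history only.
        joint_action = tuple(rule[h] for rule, h in zip(decision_rule, joint_history))
        for (next_state, joint_obs), q in dynamics(state, joint_action).items():
            # Each agent appends its own (action, observation) pair to its history.
            new_history = tuple(
                h + ((a, z),)
                for h, a, z in zip(joint_history, joint_action, joint_obs)
            )
            nxt[(next_state, new_history)] += p * q
    return dict(nxt)


def pwlc_value(occupancy, alpha_vectors):
    """Piecewise-linear convex value: maximum over linear functions of the occupancy."""
    return max(
        sum(alpha.get(key, 0.0) * p for key, p in occupancy.items())
        for alpha in alpha_vectors
    )


def toy_dynamics(state, joint_action):
    """Hypothetical two-state model: the state flips when both agents agree."""
    flipped = "R" if state == "L" else "L"
    next_state = flipped if joint_action[0] == joint_action[1] else state
    p_x = 0.75 if next_state == "L" else 0.25
    # Both agents receive the same noisy observation in this toy model.
    return {(next_state, ("x", "x")): p_x, (next_state, ("y", "y")): 1.0 - p_x}


h0 = ((), ())                           # empty history for each of the two agents
eta0 = {("L", h0): 0.5, ("R", h0): 0.5}  # initial occupancy state
rule = ({(): "a"}, {(): "a"})            # both agents play "a" at the first step
eta1 = next_occupancy(eta0, rule, toy_dynamics)
alphas = [{("L", h0): 1.0}, {("R", h0): 2.0}]  # hypothetical linear functions
value0 = pwlc_value(eta0, alphas)
```

Under these assumptions, the same occupancy state and decision rule always produce the same successor, which is the sense in which the resulting MDP is deterministic even though the underlying Dec-POMDP is stochastic.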
Document type :
Journal articles

Cited literature [66 references]

https://hal.inria.fr/hal-01279444
Contributor : Jilles Steeve Dibangoye
Submitted on : Tuesday, March 1, 2016 - 8:34:11 AM
Last modification on : Thursday, June 6, 2019 - 10:05:28 AM
Long-term archiving on : Tuesday, May 31, 2016 - 10:53:28 AM

File

dibangoye16a.pdf
Publisher files allowed on an open archive

Licence


Public Domain

Identifiers

HAL Id : hal-01279444
DOI : 10.1613/jair.4623

Citation

Jilles Steeve Dibangoye, Christopher Amato, Olivier Buffet, François Charpillet. Optimally Solving Dec-POMDPs as Continuous-State MDPs. Journal of Artificial Intelligence Research, Association for the Advancement of Artificial Intelligence, 2016, 55, pp.443-497. ⟨http://www.jair.org/⟩. ⟨10.1613/jair.4623⟩. ⟨hal-01279444⟩

Metrics

Record views : 853
File downloads : 242