
On Markov Policies For Decentralized POMDPs

Jilles Dibangoye 1
1 CHROMA - Robots coopératifs et adaptés à la présence humaine en environnements dynamiques
Inria Grenoble - Rhône-Alpes, CITI - CITI Centre of Innovation in Telecommunications and Integration of services
Abstract: This paper formulates the optimal decentralized control problem for a class of mathematical models in which the system to be controlled is characterized by a finite-state discrete-time Markov process. The states of this internal process are not directly observable by the agents; instead, the agents have access to a set of observable outputs that are only probabilistically related to the internal state of the system. The paper demonstrates that, when only a finite number of control intervals remain, the optimal payoff function of a Markov policy is a piecewise-linear, convex function of the current observation probabilities of the internal partially observable Markov process. In addition, algorithms that exploit this property to compute either an optimal or an error-bounded Markov policy and payoff function for any finite horizon are outlined.
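The piecewise-linear convexity property stated in the abstract means the value function can be represented as a maximum over a finite set of linear functions ("alpha-vectors") of the observation probabilities. The following is a minimal illustrative sketch of that representation only, not an implementation of the report's algorithms; the toy belief and alpha-vectors are made-up values for demonstration.

```python
def value(belief, alpha_vectors):
    """Evaluate a piecewise-linear convex value function at a belief.

    belief: list of observation probabilities over the internal states
            (non-negative, summing to 1).
    alpha_vectors: finite set of linear functions, each given as a list of
                   per-state values; the value function is their upper envelope:
                   V(b) = max_alpha <alpha, b>.
    """
    return max(
        sum(a_s * b_s for a_s, b_s in zip(alpha, belief))
        for alpha in alpha_vectors
    )

# Toy example (hypothetical numbers): two internal states, two alpha-vectors.
alphas = [[1.0, 0.0], [0.0, 1.0]]
print(value([1.0, 0.0], alphas))  # a corner belief picks one linear piece
print(value([0.5, 0.5], alphas))  # an interior belief lies on the envelope
```

Convexity is visible in the example: the value at the midpoint belief (0.5) is no greater than the average of the values at the two corner beliefs (1.0 each), as the upper envelope of linear functions is always convex.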
Submitted on: Wednesday, August 22, 2018
HAL Id: hal-01860060, version 1


Jilles Dibangoye. On Markov Policies For Decentralized POMDPs. [Research Report] RR-9202, INRIA Grenoble - Rhone-Alpes - CHROMA Team; CITI - CITI Centre of Innovation in Telecommunications and Integration of services; INSA Lyon. 2018. ⟨hal-01860060⟩


