Scaling Up Decentralized MDPs Through Heuristic Search

Jilles Steeve Dibangoye; Amato Christopher; Doniec Arnaud

Conference paper, Year: 2012

Jilles Steeve Dibangoye

Role: Corresponding author
PersonId: 4917
IdHAL: jilles-steeve-dibangoye
ORCID: 0000-0001-8826-4438
IdRef: 144368145

Autonomous intelligent machine

Amato Christopher

Role: Author
PersonId: 934158

Computer Science and Artificial Intelligence Laboratory [Cambridge]

Doniec Arnaud

Role: Author
PersonId: 182960
IdHAL: arnaud-doniec
ORCID: 0000-0002-3843-6729
IdRef: 11331924X

École des Mines de Douai

Abstract

Decentralized partially observable Markov decision processes (Dec-POMDPs) are rich models for cooperative decision-making under uncertainty, but are often intractable to solve optimally (NEXP-complete). The transition and observation independent Dec-MDP is a general subclass that has been shown to have complexity in NP, but optimal algorithms for this subclass are still inefficient in practice. In this paper, we first provide an updated proof that an optimal policy does not depend on the histories of the agents, but only on their local observations. We then present a new algorithm based on heuristic search that is able to expand search nodes by using constraint optimization. We show experimental results comparing our approach with state-of-the-art Dec-MDP and Dec-POMDP solvers. These results show a reduction in computation time and an increase in scalability by multiple orders of magnitude in a number of benchmarks.

Decentralized Markov decision processes are a very rich formalism for modeling cooperative systems, but their complexity remains out of reach (NEXP-complete). Dec-MDPs with independent transitions and observations form a subclass whose complexity is in NP, but exact algorithms for this class remain inefficient. In this paper, we give a new proof of the optimality of Markov policies for this formalism. We then present a new heuristic search algorithm that explores the search space using constraint optimization techniques. Finally, we show that the resulting algorithm offers the best performance on all tested instances.

Domains

Artificial intelligence [cs.AI]

https://inria.hal.science/hal-00765221

Submitted on: Friday, 14 December 2012, 12:20:04

Last modified on: Monday, 11 September 2023, 17:41:18

Dates and versions

hal-00765221, version 1 (14-12-2012)

Identifiers

HAL Id: hal-00765221, version 1

Cite

Jilles Steeve Dibangoye, Amato Christopher, Doniec Arnaud. Scaling Up Decentralized MDPs Through Heuristic Search. Conference on Uncertainty in Artificial Intelligence, Aug 2012, Catalina, United States. ⟨hal-00765221⟩

Collections

INSTITUT-TELECOM CNRS INRIA UNIV-LORRAINE INRIA2 LORIA LORIA-AIS IMT-NORD-EUROPE CERI-SN
