Conference paper · Year: 2023

Monte-Carlo Search for an Equilibrium in Dec-POMDPs

Abstract

Decentralized partially observable Markov decision processes (Dec-POMDPs) formalize the problem of designing individual controllers for a group of collaborative agents under stochastic dynamics and partial observability. Seeking a global optimum is difficult (NEXP-complete), but seeking a Nash equilibrium, where each agent's policy is a best response to the other agents' policies, is more accessible and has allowed addressing infinite-horizon problems with solutions in the form of finite-state controllers (FSCs). In this paper, we show that this approach can be adapted to cases where only a generative model (a simulator) of the Dec-POMDP is available. This requires relying on a simulation-based POMDP solver to construct an agent's FSC node by node. A related process is used to heuristically derive initial FSCs. Experiments on benchmarks show that MC-JESP is competitive with existing Dec-POMDP solvers, and even outperforms many offline methods that use explicit models.
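To make the best-response idea concrete, here is a minimal, self-contained sketch. It is not the authors' MC-JESP implementation: MC-JESP operates on infinite-horizon Dec-POMDPs and builds each agent's FSC node by node with a simulation-based POMDP solver, while this toy keeps only the two ingredients the abstract highlights, namely alternating best responses until a Nash equilibrium is reached, with values estimated solely through a generative model. The payoff table, noise level, and sample counts are illustrative assumptions.

    import random

    # Toy sketch, not the paper's MC-JESP: two collaborative agents
    # alternate best responses in a small one-shot game whose payoffs
    # are reachable only through a noisy generative model (a simulator).

    PAYOFF = {  # joint reward for two agents with three actions each
        (0, 0): 4.0, (0, 1): 0.0, (0, 2): 1.0,
        (1, 0): 0.0, (1, 1): 3.0, (1, 2): 0.0,
        (2, 0): 1.0, (2, 1): 0.0, (2, 2): 2.0,
    }

    def simulate(joint_action, noise=0.5):
        """Generative model: a noisy sample of the joint reward."""
        return PAYOFF[joint_action] + random.gauss(0.0, noise)

    def mc_value(joint_action, n=500):
        """Monte-Carlo estimate of a joint action's value via the simulator."""
        return sum(simulate(joint_action) for _ in range(n)) / n

    def best_response(agent, joint):
        """Best action for `agent` while the other agent's action is fixed."""
        def value_if(a):
            trial = list(joint)
            trial[agent] = a
            return mc_value(tuple(trial))
        return max(range(3), key=value_if)

    def find_equilibrium(max_sweeps=20):
        joint = [random.randrange(3), random.randrange(3)]
        for _ in range(max_sweeps):
            changed = False
            for agent in (0, 1):
                a = best_response(agent, joint)
                if a != joint[agent]:
                    joint[agent] = a
                    changed = True
            if not changed:  # no unilateral improvement: a Nash equilibrium
                return tuple(joint)
        return tuple(joint)

    if __name__ == "__main__":
        random.seed(0)
        print("Equilibrium joint action:", find_equilibrium())

The stopping condition mirrors the equilibrium criterion in the abstract: the loop returns a joint choice from which no single agent can increase the Monte-Carlo-estimated value by deviating unilaterally.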

Dates and versions

hal-04191493, version 1 (30-08-2023)

License

Attribution (CC BY)

Cite

Yang You, Vincent Thomas, Francis Colas, Olivier Buffet. Monte-Carlo Search for an Equilibrium in Dec-POMDPs. The 39th Conference on Uncertainty in Artificial Intelligence (UAI), Jul 2023, Pittsburgh, PA, United States. ⟨10.48550/arXiv.2305.11811⟩. ⟨hal-04191493⟩

