Conference paper, Year: 2021

Solving infinite-horizon Dec-POMDPs using Finite State Controllers within JESP

Abstract

This paper addresses collaborative planning problems formalized as Decentralized POMDPs (Dec-POMDPs) by searching for Nash equilibria, i.e., situations where each agent’s policy is a best response to the other agents’ (fixed) policies. While the Joint Equilibrium-based Search for Policies (JESP) algorithm does this in the finite-horizon setting using policy trees, we propose adapting it to infinite-horizon Dec-POMDPs by using finite state controller (FSC) policy representations. In this article, we (1) explain how to turn a Dec-POMDP with N − 1 fixed FSCs into an infinite-horizon POMDP whose solution is a best response for the N-th agent; (2) propose a JESP variant, called Inf-JESP, that uses this construction to solve infinite-horizon Dec-POMDPs; (3) introduce heuristic initializations for JESP intended to lead to good solutions; and (4) conduct experiments on state-of-the-art benchmark problems to evaluate our approach.
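Step (1) of the abstract can be illustrated with a minimal sketch. It assumes two agents, small tabular models stored as NumPy arrays, and a deterministic FSC for agent 2; the array layouts and the names `best_response_dynamics`, `fsc_action`, and `fsc_next` are hypothetical and chosen only for illustration. The extended state here is the pair (environment state, FSC node of agent 2), which may differ from the exact construction used in the paper.

```python
import numpy as np

def best_response_dynamics(T, O, fsc_action, fsc_next):
    """Joint dynamics of the single-agent problem faced by agent 1.

    Assumed (hypothetical) layouts:
      T[s, a1, a2, s2]       = P(s2 | s, a1, a2)
      O[a1, a2, s2, o1, o2]  = P(o1, o2 | a1, a2, s2)
      fsc_action[n]          = action of agent 2 in FSC node n
      fsc_next[n, o2]        = next FSC node of agent 2 after observing o2
    Returns D[s, n, a1, s2, n2, o1] = P(s2, n2, o1 | s, n, a1).
    """
    S, A1, _, _ = T.shape
    O1, O2 = O.shape[3], O.shape[4]
    N = len(fsc_action)
    D = np.zeros((S, N, A1, S, N, O1))
    for n in range(N):
        a2 = int(fsc_action[n])          # agent 2's action is fixed by its node
        for o2 in range(O2):
            n2 = int(fsc_next[n, o2])    # node update driven by agent 2's observation
            # P(s2, o1, o2 | s, a1, a2), shape (S, A1, S, O1) after broadcasting
            joint = T[:, :, a2, :, None] * O[None, :, a2, :, :, o2]
            # Marginalize o2: agent 1 never sees it, only its effect on n2
            D[:, n, :, :, n2, :] += joint
    return D

# Tiny random problem: 3 states, 2 actions and 2 observations per agent
rng = np.random.default_rng(0)
S, A1, A2, O1, O2 = 3, 2, 2, 2, 2
T = rng.random((S, A1, A2, S))
T /= T.sum(axis=-1, keepdims=True)
O = rng.random((A1, A2, S, O1, O2))
O /= O.sum(axis=(3, 4), keepdims=True)

fsc_action = np.array([0, 1])            # two-node FSC for agent 2
fsc_next = np.array([[0, 1], [1, 0]])

D = best_response_dynamics(T, O, fsc_action, fsc_next)
```

For every extended state and action of agent 1, `D` is a probability distribution over the next extended state and agent 1's observation, so any infinite-horizon POMDP solver that accepts such joint dynamics could, in principle, be used to compute agent 1's best response.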

Dates and versions

hal-03523449, version 1 (12-01-2022)

Cite

Yang You, Vincent Thomas, Francis Colas, Olivier Buffet. Solving infinite-horizon Dec-POMDPs using Finite State Controllers within JESP. ICTAI 2021 - IEEE 33rd International Conference on Tools with Artificial Intelligence, Nov 2021, Washington/virtual, United States. pp.427-434, ⟨10.1109/ICTAI52525.2021.00069⟩. ⟨hal-03523449⟩