On Bellman's Optimality Principle for zs-POSGs - Archive ouverte HAL Access content directly
Preprints, Working Papers, ... Year : 2020

On Bellman's Optimality Principle for zs-POSGs

(1) , (2) , (2, 1) , (3) , (1)
1
2
3

Abstract

Many non-trivial sequential decision-making problems are efficiently solved by relying on Bellman's optimality principle, i.e., exploiting the fact that sub-problems are nested recursively within the original problem. Here we show how it can apply to (infinite horizon) 2-player zero-sum partially observable stochastic games (zs-POSGs) by (i) taking a central planner's viewpoint, which can only reason on a sufficient statistic called occupancy state, and (ii) turning such problems into zero-sum occupancy Markov games (zs-OMGs). Then, exploiting the Lipschitz-continuity of the value function in occupancy space, one can derive a version of the HSVI algorithm (Heuristic Search Value Iteration) that provably finds an-Nash equilibrium in finite time.
Fichier principal
Vignette du fichier
2006.16395.pdf (466.94 Ko) Télécharger le fichier
Origin : Files produced by the author(s)

Dates and versions

hal-03080287 , version 1 (18-12-2020)

Identifiers

Cite

Olivier Buffet, Jilles Dibangoye, Aurélien Delage, Abdallah Saffidine, Vincent Thomas. On Bellman's Optimality Principle for zs-POSGs. 2020. ⟨hal-03080287⟩
57 View
37 Download

Altmetric

Share

Gmail Facebook Twitter LinkedIn More