
On Bellman's Optimality Principle for zs-POSGs

Olivier Buffet (1), Jilles Dibangoye (2), Aurélien Delage (2, 1), Abdallah Saffidine (3), Vincent Thomas (1)
1 LARSEN - Lifelong Autonomy and interaction skills for Robots in a Sensing ENvironment
Inria Nancy - Grand Est, LORIA - AIS - Department of Complex Systems, Artificial Intelligence & Robotics
2 CHROMA - Robots coopératifs et adaptés à la présence humaine en environnements dynamiques
Inria Grenoble - Rhône-Alpes, CITI - CITI Centre of Innovation in Telecommunications and Integration of services
Abstract : Many non-trivial sequential decision-making problems are efficiently solved by relying on Bellman's optimality principle, i.e., exploiting the fact that sub-problems are nested recursively within the original problem. Here we show how it can apply to (infinite-horizon) 2-player zero-sum partially observable stochastic games (zs-POSGs) by (i) taking a central planner's viewpoint, which can only reason on a sufficient statistic called occupancy state, and (ii) turning such problems into zero-sum occupancy Markov games (zs-OMGs). Then, exploiting the Lipschitz-continuity of the value function in occupancy space, one can derive a version of the HSVI algorithm (Heuristic Search Value Iteration) that provably finds an ε-Nash equilibrium in finite time.
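As a rough illustration of why Lipschitz-continuity helps a point-based method such as HSVI, the sketch below generalizes bound values stored at a few visited points to any other point of the space. It is not the authors' construction: the function names, the constant `lam`, the choice of the L1 norm, and the sample values are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]  # hypothetical stand-in for an occupancy state


def l1_distance(a: Point, b: Point) -> float:
    """L1 distance between two points of equal dimension."""
    return sum(abs(x - y) for x, y in zip(a, b))


def upper_bound(samples: List[Tuple[Point, float]], query: Point, lam: float) -> float:
    """Tightest upper bound at `query` implied by lam-Lipschitz continuity.

    Each sample (p, u) states that the value at p is at most u, so the
    value at `query` is at most u + lam * ||p - query||_1.
    """
    return min(u + lam * l1_distance(p, query) for p, u in samples)


def lower_bound(samples: List[Tuple[Point, float]], query: Point, lam: float) -> float:
    """Tightest lower bound at `query` implied by lam-Lipschitz continuity."""
    return max(v - lam * l1_distance(p, query) for p, v in samples)


def gap(upper_samples: List[Tuple[Point, float]],
        lower_samples: List[Tuple[Point, float]],
        query: Point, lam: float) -> float:
    """Width of the bound interval at `query`.

    A heuristic search would typically stop once this width drops below
    the target precision at the initial point.
    """
    return upper_bound(upper_samples, query, lam) - lower_bound(lower_samples, query, lam)


# Assumed toy data: values known exactly at two points, Lipschitz constant 1.
known = [((1.0, 0.0), 2.0), ((0.0, 1.0), 0.0)]
midpoint = (0.5, 0.5)
print(lower_bound(known, midpoint, 1.0), upper_bound(known, midpoint, 1.0))
```

Because each stored point constrains the value everywhere else, a finite set of visited points can bound the value function over the whole continuous space, which is what makes a finite-time termination argument plausible.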
Document type : Preprints, Working Papers, ...
Contributor : Olivier Buffet
Submitted on : Friday, December 18, 2020 - 10:58:09 AM
Last modification on : Wednesday, November 3, 2021 - 7:08:58 AM
Long-term archiving on : Friday, March 19, 2021 - 6:14:43 PM





Olivier Buffet, Jilles Dibangoye, Aurélien Delage, Abdallah Saffidine, Vincent Thomas. On Bellman's Optimality Principle for zs-POSGs. 2020. ⟨hal-03080287⟩


