
Heuristic Search Value Iteration for zero-sum Stochastic Games

Olivier Buffet 1, Jilles Dibangoye 2, Abdallah Saffidine 3, Vincent Thomas 1
1 LARSEN - Lifelong Autonomy and interaction skills for Robots in a Sensing ENvironment
Inria Nancy - Grand Est, LORIA - AIS - Department of Complex Systems, Artificial Intelligence & Robotics
2 CHROMA - Robots coopératifs et adaptés à la présence humaine en environnements dynamiques
Inria Grenoble - Rhône-Alpes, CITI - CITI Centre of Innovation in Telecommunications and Integration of services
Abstract: In sequential decision-making, heuristic search algorithms exploit both the initial situation and an admissible heuristic to search efficiently for an optimal solution, often for planning purposes. Such algorithms exist for problems with uncertain dynamics, partial observability, multiple criteria, or multiple collaborating agents. Here we look at two-player zero-sum stochastic games with a discounted criterion, with a view to proposing a solution tailored to the fully observable case, whereas existing solutions address particular, though still more general, partially observable cases. This setting induces reasoning on both a lower and an upper bound of the value function, which leads us to propose zsSG-HSVI, an algorithm based on Heuristic Search Value Iteration (HSVI), which thus relies on generating trajectories. We demonstrate that, with each player acting optimistically and with simple heuristic initializations, HSVI's convergence in finite time to an ϵ-optimal solution is preserved. An empirical study of the resulting approach is conducted on benchmark problems of various sizes.
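As a reading aid, the bound-based reasoning mentioned in the abstract can be written out in standard notation. The symbols below (state s, action sets A_1 and A_2, reward r, transition kernel P, discount factor γ, bounds \underline{V} and \overline{V}, initial state s_0) are assumptions chosen for illustration and are not taken from the article itself. The first equation is Shapley's fixed-point characterization of the value of a discounted zero-sum stochastic game; the remaining lines show the usual HSVI-style sandwich and stopping test at the initial state, given here as a sketch and not as the authors' exact formulation.

```latex
% Assumed notation: \Delta(A_i) is the set of mixed actions of player i.
\begin{align*}
V^*(s) &= \max_{x \in \Delta(A_1)} \; \min_{y \in \Delta(A_2)}
          \sum_{a \in A_1} \sum_{b \in A_2} x(a)\, y(b)
          \Big[ r(s,a,b) + \gamma \sum_{s'} P(s' \mid s,a,b)\, V^*(s') \Big], \\
\underline{V}(s) &\le V^*(s) \le \overline{V}(s) \quad \text{for all } s, \\
\overline{V}(s_0) - \underline{V}(s_0) &\le \epsilon
  \;\Longrightarrow\;
  \big| V^*(s_0) - \underline{V}(s_0) \big| \le \epsilon
  \ \text{ and } \
  \big| \overline{V}(s_0) - V^*(s_0) \big| \le \epsilon .
\end{align*}
```

Under these assumptions, an algorithm that keeps both bounds valid and shrinks the gap at s_0 below ϵ returns an ϵ-optimal value estimate for the initial state.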
Document type: Journal article
Contributor: Olivier Buffet
Submitted on: Thursday, May 27, 2021 - 2:58:22 PM
Last modification on: Friday, January 21, 2022 - 3:11:19 AM
Long-term archiving on: Saturday, August 28, 2021 - 7:34:16 PM


Files produced by the author(s)



Olivier Buffet, Jilles Dibangoye, Abdallah Saffidine, Vincent Thomas. Heuristic Search Value Iteration for zero-sum Stochastic Games. IEEE Transactions on Games, Institute of Electrical and Electronics Engineers, 2021, 13 (3), pp.1-10. ⟨10.1109/TG.2020.3005214⟩. ⟨hal-03080314⟩


