Solving multichain stochastic games with mean payoff by policy iteration

Marianne Akian 1, 2 Jean Cochet-Terrasson 3 Sylvie Detournay 1, 2 Stéphane Gaubert 1, 2
1 MAXPLUS - Max-plus algebras and mathematics of decision
CMAP - Centre de Mathématiques Appliquées - Ecole Polytechnique, Inria Saclay - Ile de France, X - École polytechnique, CNRS - Centre National de la Recherche Scientifique : UMR
Abstract: Zero-sum stochastic games with finite state and action spaces, perfect information, and mean payoff criteria arise in particular from the monotone discretization of mean-payoff pursuit-evasion deterministic differential games. In that case, no irreducibility assumption on the Markov chains associated with strategies is satisfied (multichain games). The value of such a game can be characterized by a system of nonlinear equations involving the mean payoff vector and an auxiliary vector (relative value or bias). Cochet-Terrasson and Gaubert proposed in (C. R. Math. Acad. Sci. Paris, 2006) a policy iteration algorithm relying on a notion of nonlinear spectral projection (Akian and Gaubert, Nonlinear Analysis TMA, 2003), which allows one to avoid cycling in degenerate iterations. We give here a complete presentation of the algorithm, with details of implementation, in particular of the nonlinear projection. This has led to the software PIGAMES and allowed us to present numerical results on pursuit-evasion games.
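To make the notion of a multichain mean payoff concrete, the following minimal sketch estimates the mean payoff vector of a tiny turn-based game by plain value iteration. This is not the policy iteration algorithm of the paper and has nothing to do with PIGAMES; the game data, the names `GAME`, `shapley` and `mean_payoff_estimate`, and the convention that each state is owned by a single player are all assumptions made for illustration.

```python
# Hypothetical toy game: each state is (owner, actions), and each action is
# (reward, {next_state: probability}). The owner picks the action.
GAME = [
    ("max", [(0.0, {1: 1.0}), (0.0, {2: 1.0})]),   # state 0: MAX picks a branch
    ("min", [(1.0, {1: 1.0}), (5.0, {1: 1.0})]),   # state 1: absorbing, MIN pays 1 or 5
    ("max", [(3.0, {2: 1.0})]),                    # state 2: absorbing, reward 3
    ("min", [(0.0, {1: 0.5, 2: 0.5})]),            # state 3: random move to 1 or 2
]


def shapley(game, v):
    """Apply one step of the dynamic programming (Shapley) operator to v."""
    out = []
    for owner, actions in game:
        vals = [r + sum(p * v[j] for j, p in nxt.items()) for r, nxt in actions]
        out.append(max(vals) if owner == "max" else min(vals))
    return out


def mean_payoff_estimate(game, iters=2000):
    """Estimate the mean payoff vector as F^k(0) / k for k = iters."""
    v = [0.0] * len(game)
    for _ in range(iters):
        v = shapley(game, v)
    return [x / iters for x in v]


if __name__ == "__main__":
    print(mean_payoff_estimate(GAME))
```

In this toy game, states 1 and 2 are absorbing with different rewards, so the estimated mean payoff depends on the initial state (roughly 3, 1, 3 and 2 for states 0 to 3). That state dependence is what distinguishes the multichain setting from the irreducible case. Value iteration converges only at rate O(1/k) here, which is one motivation for policy iteration methods.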
Document type:
Conference paper
CDC 2013 - 52nd IEEE Conference on Decision and Control, Dec 2013, Florence, Italy. IEEE, Decision and Control (CDC), 2013 IEEE 52nd Annual Conference on, pp.1834-1841, 2013, 〈10.1109/CDC.2013.6760149〉
https://hal.inria.fr/hal-00933689
Contributor: Marianne Akian
Submitted on: Monday, January 20, 2014 - 21:41:54
Last modified on: Thursday, May 10, 2018 - 02:05:04

Citation

Marianne Akian, Jean Cochet-Terrasson, Sylvie Detournay, Stéphane Gaubert. Solving multichain stochastic games with mean payoff by policy iteration. CDC 2013 - 52nd IEEE Conference on Decision and Control, Dec 2013, Florence, Italy. IEEE, Decision and Control (CDC), 2013 IEEE 52nd Annual Conference on, pp.1834-1841, 2013, 〈10.1109/CDC.2013.6760149〉. 〈hal-00933689〉