Solving Ergodic Markov Decision Processes and Perfect Information Zero-sum Stochastic Games by Variance Reduced Deflated Value Iteration - Inria - Institut national de recherche en sciences et technologies du numérique Accéder directement au contenu
Communication Dans Un Congrès Année : 2019

Solving Ergodic Markov Decision Processes and Perfect Information Zero-sum Stochastic Games by Variance Reduced Deflated Value Iteration

Résumé

Recently, Sidford, Wang, Wu and Ye (2018) developed an algorithm combining variance reduction techniques with value iteration to solve discounted Markov decision processes. This algorithm has a sublinear complexity when the discount factor is fixed. Here, we extend this approach to mean-payoff problems, including both Markov decision processes and perfect information zero-sum stochastic games. We obtain sublinear complexity bounds, assuming there is a distinguished state which is accessible from all initial states and for all policies. Our method is based on a reduction from the mean payoff problem to the discounted problem by a Doob h-transform, combined with a deflation technique. The complexity analysis of this algorithm uses at the same time the techniques developed by Sidford et al. in the discounted case and non-linear spectral theory techniques (Collatz-Wielandt characterization of the eigenvalue).

Dates et versions

hal-02423846 , version 1 (26-12-2019)

Identifiants

Citer

Marianne Akian, Stéphane Gaubert, Zheng Qu, Omar Saadi. Solving Ergodic Markov Decision Processes and Perfect Information Zero-sum Stochastic Games by Variance Reduced Deflated Value Iteration. CDC 2019 - 58th IEEE Conference on Decision and Control, Dec 2019, Nice, France. ⟨hal-02423846⟩
146 Consultations
0 Téléchargements

Altmetric

Partager

Gmail Facebook X LinkedIn More