Bisimulation for Markov Decision Processes through Families of Functional Expressions

Norman Ferns 1 Sophia Knight 2 Doina Precup 3
1 ANTIQUE - Analyse Statique par Interprétation Abstraite
DI-ENS - Département d'informatique de l'École normale supérieure, Inria Paris-Rocquencourt
2 COMETE - Concurrency, Mobility and Transactions
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], Inria Saclay - Ile de France
Abstract : We transfer a notion of quantitative bisimilarity for labelled Markov processes to Markov decision processes with continuous state spaces. This notion takes the form of a pseudometric on the system states, cast in terms of the equivalence of a family of functional expressions evaluated on those states and interpreted as a real-valued modal logic. Our proof amounts to a slight modification of previous techniques used to prove equivalence with a fixed-point pseudometric on the state-space of a labelled Markov process and making heavy use of the Kantorovich probability metric. Indeed, we again demonstrate equivalence with a fixed-point pseudometric defined on Markov decision processes; what is novel is that we recast this proof in terms of integral probability metrics [5] defined through the family of functional expressions, shifting emphasis back to properties of such families. The hope is that a judicious choice of family might lead to something more computationally tractable than bisimilarity whilst maintaining its pleasing theoretical guarantees. Moreover, we use a trick from descriptive set theory to extend our results to MDPs with bounded measurable reward functions, dropping a previous continuity constraint on rewards and Markov kernels.
Document type :
Conference papers
Liste complète des métadonnées

https://hal.inria.fr/hal-01098566
Contributor : Jérôme Feret <>
Submitted on : Friday, December 26, 2014 - 4:44:06 PM
Last modification on : Wednesday, March 27, 2019 - 4:41:28 PM

Identifiers

Collections

Citation

Norman Ferns, Sophia Knight, Doina Precup. Bisimulation for Markov Decision Processes through Families of Functional Expressions. Horizons of the Mind. A Tribute to Prakash Panangaden (for his 60th birthday), Franck van Breugel; Elham Kashefi; Castucia Palamidessi; Jan Rutten, May 2014, Oxford, United Kingdom. pp.319-342, ⟨10.1007/978-3-319-06880-0_17⟩. ⟨hal-01098566⟩

Share

Metrics

Record views

461