Bisimulation for Markov Decision Processes through Families of Functional Expressions

Norman Ferns 1 Sophia Knight 2 Doina Precup 3
1 ANTIQUE - Analyse Statique par Interprétation Abstraite
DI-ENS - Département d'informatique de l'École normale supérieure, Inria Paris-Rocquencourt
2 COMETE - Concurrency, Mobility and Transactions
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], Inria Saclay - Ile de France, Polytechnique - X, CNRS - Centre National de la Recherche Scientifique : UMR7161
Abstract : We transfer a notion of quantitative bisimilarity for labelled Markov processes to Markov decision processes with continuous state spaces. This notion takes the form of a pseudometric on the system states, cast in terms of the equivalence of a family of functional expressions evaluated on those states and interpreted as a real-valued modal logic. Our proof amounts to a slight modification of previous techniques used to prove equivalence with a fixed-point pseudometric on the state-space of a labelled Markov process and making heavy use of the Kantorovich probability metric. Indeed, we again demonstrate equivalence with a fixed-point pseudometric defined on Markov decision processes; what is novel is that we recast this proof in terms of integral probability metrics [5] defined through the family of functional expressions, shifting emphasis back to properties of such families. The hope is that a judicious choice of family might lead to something more computationally tractable than bisimilarity whilst maintaining its pleasing theoretical guarantees. Moreover, we use a trick from descriptive set theory to extend our results to MDPs with bounded measurable reward functions, dropping a previous continuity constraint on rewards and Markov kernels.
Type de document :
Communication dans un congrès
Franck van Breugel; Elham Kashefi; Castucia Palamidessi; Jan Rutten. Horizons of the Mind. A Tribute to Prakash Panangaden (for his 60th birthday), May 2014, Oxford, United Kingdom. Springer, Lecture Notes in Computer Science, 8464, pp.319-342, 2014, Horizons of the Mind. A Tribute to Prakash Panangaden. 〈http://www.springer.com/computer/theoretical+computer+science/book/978-3-319-06879-4〉. 〈10.1007/978-3-319-06880-0_17〉
Liste complète des métadonnées

https://hal.inria.fr/hal-01098566
Contributeur : Jérôme Feret <>
Soumis le : vendredi 26 décembre 2014 - 16:44:06
Dernière modification le : jeudi 11 janvier 2018 - 06:25:39

Identifiants

Collections

Citation

Norman Ferns, Sophia Knight, Doina Precup. Bisimulation for Markov Decision Processes through Families of Functional Expressions. Franck van Breugel; Elham Kashefi; Castucia Palamidessi; Jan Rutten. Horizons of the Mind. A Tribute to Prakash Panangaden (for his 60th birthday), May 2014, Oxford, United Kingdom. Springer, Lecture Notes in Computer Science, 8464, pp.319-342, 2014, Horizons of the Mind. A Tribute to Prakash Panangaden. 〈http://www.springer.com/computer/theoretical+computer+science/book/978-3-319-06879-4〉. 〈10.1007/978-3-319-06880-0_17〉. 〈hal-01098566〉

Partager

Métriques

Consultations de la notice

184