Solving $K$-MDPs

Jonathan Ferrer-Mestres; Thomas G. Dietterich; Olivier Buffet; Iadine Chadès

Conference paper, Year: 2020



Jonathan Ferrer-Mestres

Role: Author
PersonId : 1086331

Commonwealth Scientific and Industrial Research Organisation [Canberra]

Thomas G. Dietterich

Role: Author
PersonId : 1086332

Oregon State University

Olivier Buffet

Role: Author
PersonId : 1407
IdHAL : olivier-buffet
ORCID : 0000-0002-5072-5857

Lifelong Autonomy and interaction skills for Robots in a Sensing ENvironment

Iadine Chadès

Role: Author

Commonwealth Scientific and Industrial Research Organisation [Canberra]

Abstract

Markov Decision Processes (MDPs) are employed to model sequential decision-making problems under uncertainty. Traditionally, algorithms to solve MDPs have focused on handling large state or action spaces. With increasing applications of MDPs to human-operated domains such as conservation of biodiversity and health, developing easy-to-interpret solutions is of paramount importance to increase uptake of MDP policies. Here, we define the problem of solving $K$-MDPs, i.e., given an original MDP and a constraint on the number of states ($K$), generate a reduced state space MDP that minimizes the difference between the original optimal MDP value function and the reduced optimal $K$-MDP value function. Building on existing non-transitive and transitive approximate state abstraction functions, we propose a family of three algorithms based on binary search with sub-optimality bounded polynomially in a precision parameter: $\phi_{Q^*_\varepsilon}$ $K$-MDP-ILP, $\phi_{Q^*_d}$ $K$-MDP and $\phi_{a^*_d}$ $K$-MDP. We compare these algorithms to a greedy algorithm ($\phi_{Q^*_\varepsilon}$ Greedy $K$-MDP) and a clustering approach (k-means++ $K$-MDP). On randomly generated MDPs and two computational sustainability MDPs, $\phi_{a^*_d}$ $K$-MDP outperformed all other algorithms when it could find a feasible solution. While numerous state abstraction problems have been proposed in the literature, this is the first time that the general problem of solving $K$-MDPs is suggested. We hope that our work will generate future research aiming at increasing the interpretability of MDP policies in human-operated domains.
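To make the problem statement concrete, here is a minimal Python sketch of the general idea: compute the optimal Q-values of a small MDP, group states whose Q-values are within ε of a representative state, and binary-search ε until at most $K$ groups remain. This is an illustration under stated assumptions, not the authors' algorithms: the function names (`value_iteration`, `greedy_abstraction`, `k_mdp_abstraction`, `build_abstract_mdp`), the representative-based grouping rule, the uniform weighting of states within a group, and the random test MDP are all hypothetical choices made for this example.

```python
import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-8):
    """P: (A, S, S) transition probabilities, R: (S, A) rewards. Returns Q* of shape (S, A)."""
    Q = np.zeros(R.shape)
    while True:
        V = Q.max(axis=1)
        Q_new = R + gamma * np.einsum("ast,t->sa", P, V)
        if np.abs(Q_new - Q).max() < tol:
            return Q_new
        Q = Q_new

def greedy_abstraction(Q, eps):
    """Assumed grouping rule: put each state in the first group whose representative
    satisfies |Q*(s,a) - Q*(rep,a)| <= eps for all actions; otherwise open a new group."""
    reps, phi = [], np.empty(len(Q), dtype=int)
    for s in range(len(Q)):
        for k, r in enumerate(reps):
            if np.abs(Q[s] - Q[r]).max() <= eps:
                phi[s] = k
                break
        else:
            reps.append(s)
            phi[s] = len(reps) - 1
    return phi

def k_mdp_abstraction(Q, K, precision=1e-3):
    """Binary search on eps for a small threshold that yields at most K groups.
    The greedy rule is not guaranteed to be monotone in eps, so this is a heuristic."""
    lo, hi = 0.0, float(Q.max() - Q.min())  # eps = hi always gives a single group
    best = greedy_abstraction(Q, hi)
    while hi - lo > precision:
        mid = (lo + hi) / 2
        phi = greedy_abstraction(Q, mid)
        if phi.max() + 1 <= K:
            best, hi = phi, mid
        else:
            lo = mid
    return best

def build_abstract_mdp(P, R, phi):
    """Aggregate transitions and rewards, weighting the states of each group uniformly."""
    S, n_groups = P.shape[1], phi.max() + 1
    M = np.zeros((S, n_groups))
    M[np.arange(S), phi] = 1.0            # membership matrix
    W = M / M.sum(axis=0)                 # each column sums to 1
    P_k = np.einsum("sk,ast,tl->akl", W, P, M)
    R_k = W.T @ R
    return P_k, R_k

# Hypothetical random MDP with 12 states and 3 actions, reduced to at most K = 4 states.
rng = np.random.default_rng(0)
S, A, K = 12, 3, 4
P = rng.dirichlet(np.ones(S), size=(A, S))   # shape (A, S, S), rows sum to 1
R = rng.random((S, A))

Q = value_iteration(P, R)
phi = k_mdp_abstraction(Q, K)
P_k, R_k = build_abstract_mdp(P, R, phi)
Q_k = value_iteration(P_k, R_k)

# Largest difference between the original optimal values and the lifted K-MDP values.
gap = np.abs(Q.max(axis=1) - Q_k.max(axis=1)[phi]).max()
print("abstract states:", phi.max() + 1, "value gap:", gap)
```

The quantity `gap` is one way to measure the objective described in the abstract, namely the difference between the original optimal value function and the optimal value function of the reduced MDP.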

Domains

Artificial Intelligence [cs.AI]


https://inria.hal.science/hal-03080154

Submitted on: Thursday, December 17, 2020, 22:07:09

Last modified on: Monday, September 11, 2023, 17:41:19

Dates and versions

hal-03080154, version 1 (17-12-2020)

Identifiers

HAL Id: hal-03080154, version 1

Cite

Jonathan Ferrer-Mestres, Thomas G. Dietterich, Olivier Buffet, Iadine Chadès. Solving $K$-MDPs. ICAPS 2020 - International Conference on Automated Planning and Scheduling, Oct 2020, Nancy / Virtual, France. ⟨hal-03080154⟩


Collections

CNRS INRIA UNIV-LORRAINE INRIA2 LORIA LORIA-AIS
