The Factored Policy-Gradient Planner

Olivier Buffet (1), Douglas Aberdeen
(1) MAIA - Autonomous intelligent machine, INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: We present an any-time concurrent probabilistic temporal planner (CPTP) that includes continuous and discrete uncertainties and metric functions. Rather than relying on dynamic programming, our approach builds on methods from stochastic local policy search. That is, we optimise a parameterised policy using gradient ascent. The flexibility of this policy-gradient approach, combined with its low memory use, the use of function approximation methods and factorisation of the policy, allows us to tackle complex domains. This Factored Policy Gradient (FPG) planner can optimise steps to goal, the probability of success, or attempt a combination of both. We compare the FPG planner to other planners on CPTP domains, and on simpler but better-studied non-concurrent, non-temporal probabilistic planning (PP) domains. We present FPG-ipc, the PP version of the planner, which was successful in the probabilistic track of the fifth international planning competition.
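As an illustration only, the idea of optimising a factored, parameterised policy by gradient ascent can be sketched on a toy problem. This is not the FPG planner itself: the sketch uses a plain REINFORCE-style update with a running-average baseline, and it assumes that "factorisation" means one independent sub-policy per action. The function name `train_factored_policy`, the `target` vector, the reward definition, and all hyperparameters are hypothetical choices made for this example.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping a real parameter to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def train_factored_policy(target, steps=5000, lr=0.1, seed=0):
    """Toy gradient ascent on a factored stochastic policy.

    Assumption for this sketch: each action k has its own one-parameter
    Bernoulli policy that decides whether to start the action.
    `target[k]` is the (hypothetical) desired choice for action k, and the
    reward is the number of actions that agree with the target.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(target)  # one parameter per action sub-policy
    baseline = 0.0               # running average of the reward

    for _ in range(steps):
        probs = [sigmoid(t) for t in theta]
        # Each sub-policy samples its own decision independently.
        actions = [1 if rng.random() < p else 0 for p in probs]
        reward = sum(1.0 for a, g in zip(actions, target) if a == g)

        # For a Bernoulli policy with a logistic parameterisation,
        # d/d(theta_k) log pi(a_k) = a_k - p_k.
        advantage = reward - baseline
        for k in range(len(theta)):
            theta[k] += lr * advantage * (actions[k] - probs[k])

        # Update the baseline after the gradient step.
        baseline += 0.05 * (reward - baseline)

    return [sigmoid(t) for t in theta]


if __name__ == "__main__":
    # Probability that each sub-policy starts its action after training.
    print(train_factored_policy([1, 0, 1]))
```

Memory use in this sketch grows with the number of policy parameters rather than with the number of states, which is one reading of the low memory use the abstract attributes to the policy-gradient approach.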
Document type: Journal article
Artificial Intelligence, Elsevier, 2009, 173 (5-6), pp.722-747. 〈10.1016/j.artint.2008.11.008〉
Contributor: Olivier Buffet
Submitted on: Tuesday, 14 October 2008 - 09:04:35
Last modified on: Thursday, 11 January 2018 - 06:19:50





Olivier Buffet, Douglas Aberdeen. The Factored Policy-Gradient Planner. Artificial Intelligence, Elsevier, 2009, 173 (5-6), pp.722-747. 〈10.1016/j.artint.2008.11.008〉. 〈inria-00330031〉


