The Factored Policy-Gradient Planner

Olivier Buffet; Douglas Aberdeen

doi:10.1016/j.artint.2008.11.008

Journal Articles Artificial Intelligence Year : 2009

Olivier Buffet

IdHAL : olivier-buffet
ORCID : 0000-0002-5072-5857

Autonomous intelligent machine

Douglas Aberdeen

Abstract

We present an any-time concurrent probabilistic temporal planner (CPTP) that includes continuous and discrete uncertainties and metric functions. Rather than relying on dynamic programming, our approach builds on methods from stochastic local policy search; that is, we optimise a parameterised policy using gradient ascent. The flexibility of this policy-gradient approach, combined with its low memory use, the use of function approximation methods, and factorisation of the policy, allows us to tackle complex domains. This Factored Policy-Gradient (FPG) planner can optimise steps to goal, the probability of success, or a combination of both. We compare the FPG planner to other planners on CPTP domains, and on simpler but better-studied non-concurrent, non-temporal probabilistic planning (PP) domains. We present FPG-ipc, the PP version of the planner, which has been successful in the probabilistic track of the fifth international planning competition.
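
To make the idea of gradient ascent on a factored, parameterised policy concrete, here is a minimal sketch. It is not the authors' implementation: the toy reward, the one-parameter logistic sub-policies, and the names `train`, `num_tasks`, `lr` and `baseline` are all assumptions chosen for illustration. Each task gets its own independent sub-policy that decides whether to start that task, and every sub-policy is updated with a REINFORCE-style gradient estimate from the same scalar reward.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train(num_tasks=3, episodes=2000, lr=0.1, seed=0):
    """REINFORCE-style gradient ascent on a factored Bernoulli policy.

    Hypothetical toy problem (an assumption, not from the paper):
    starting an even-numbered task pays +1, an odd-numbered one -1.
    """
    rng = random.Random(seed)
    theta = [0.0] * num_tasks  # one parameter per task: the factorisation
    baseline = 0.0             # running average reward, reduces variance
    for _ in range(episodes):
        probs = [sigmoid(t) for t in theta]
        # Each sub-policy samples its own start/skip decision independently.
        starts = [1 if rng.random() < p else 0 for p in probs]
        reward = sum(
            (1.0 if i % 2 == 0 else -1.0) * s for i, s in enumerate(starts)
        )
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        for i in range(num_tasks):
            # Gradient of log Bernoulli(start; sigmoid(theta)) is start - p.
            theta[i] += lr * advantage * (starts[i] - probs[i])
    return [sigmoid(t) for t in theta]


if __name__ == "__main__":
    print(train())
```

Memory use in this sketch grows with the number of tasks rather than with the number of joint states, which is the kind of saving the abstract attributes to factorisation; a real planner would condition each sub-policy on state features instead of using a single bias parameter.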

Keywords

Concurrent Probabilistic Temporal Planning; Reinforcement Learning; Policy-Gradient; AI Planning

Domains

Artificial Intelligence [cs.AI]; Machine Learning [cs.LG]

https://inria.hal.science/inria-00330031

Submitted on : Tuesday, October 14, 2008, 9:04:35 AM

Last modification on : Thursday, February 15, 2024, 3:32:19 AM

Dates and versions

inria-00330031 , version 1 (14-10-2008)

Identifiers

HAL Id : inria-00330031 , version 1
DOI : 10.1016/j.artint.2008.11.008

Cite

Olivier Buffet, Douglas Aberdeen. The Factored Policy-Gradient Planner. Artificial Intelligence, 2009, 173 (5-6), pp.722-747. ⟨10.1016/j.artint.2008.11.008⟩. ⟨inria-00330031⟩

Collections

UNIV-RENNES1 CNRS INRIA IRISA UNIV-LORRAINE INRIA2 LORIA UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UR1-MATH-NUM
