All that Glitters is not Gold: Using Landmarks for Reward Shaping in FPG

Olivier Buffet (1), Joerg Hoffmann (1)
(1) MAIA - Autonomous intelligent machine, INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: Landmarks are facts that must be true at some point in any plan. It has recently been proposed in classical planning to use landmarks for the automatic generation of heuristic functions. We herein apply this idea in probabilistic planning. We focus on the FPG tool, which derives a factored policy by learning from samples of the state space. The rationale is that FPG's performance can be improved significantly by a trivial heuristic that counts the number of false goals; landmarks provide much better estimates at little overhead cost. We devise improved versions of the classical landmarks heuristic, including a Markovian one that, unlike previous ones, does not depend on the state history. As done previously in FPG for goal counting, we use the heuristics for reward shaping: the planner gets a positive reward when it improves the heuristic value. Based on previous work, we argue that such shaping is policy invariant for Markovian heuristics. Our empirical results confirm that the landmarks heuristics are almost as fast as goal counting, while delivering much more accurate estimates for initial states. In spite of this, overall planner performance is almost never improved. We discuss some intuitions as to why that is so.
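The reward-shaping idea in the abstract can be illustrated with a minimal sketch. The abstract does not give FPG's exact shaping formula, so this sketch assumes the standard potential-based form, with potential phi(s) = -h(s), under which shaping by a state-only (Markovian) heuristic is known to be policy invariant. The state representation, the fact names, and the function names `goal_count` and `shaping_reward` are all hypothetical.

```python
from typing import FrozenSet

# Assumption: a state is modeled as the set of facts that currently hold.
State = FrozenSet[str]


def goal_count(state: State, goals: FrozenSet[str]) -> int:
    """Goal-counting heuristic: number of goal facts that are false in `state`."""
    return len(goals - state)


def shaping_reward(h_prev: float, h_next: float, gamma: float = 1.0) -> float:
    """Potential-based shaping term with potential phi(s) = -h(s).

    Computes gamma * phi(s') - phi(s) = h(s) - gamma * h(s'),
    which is positive when the heuristic value improves (decreases).
    """
    return h_prev - gamma * h_next


# Hypothetical two-goal task: achieving one goal lowers h from 2 to 1.
goals = frozenset({"at-goal", "has-key"})
s0 = frozenset({"at-start"})
s1 = frozenset({"at-start", "has-key"})

bonus = shaping_reward(goal_count(s0, goals), goal_count(s1, goals))
```

Because `shaping_reward` depends only on the heuristic values of the current and next state, any heuristic that is a function of the state alone fits this form; a heuristic that depends on the state history does not, which is why a Markovian landmarks heuristic matters for the invariance argument.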
Document type: Conference papers
Cited literature: 16 references
Contributor: Joerg Hoffmann
Submitted on: Wednesday, November 10, 2010 - 9:28:59 AM
Last modification on: Saturday, June 25, 2022 - 7:40:32 PM
Long-term archiving on: Friday, October 26, 2012 - 3:21:14 PM


Files produced by the author(s)


  • HAL Id: inria-00534375, version 1



Olivier Buffet, Joerg Hoffmann. All that Glitters is not Gold: Using Landmarks for Reward Shaping in FPG. ICAPS-10 Workshop on Planning and Scheduling Under Uncertainty, May 2010, Toronto, Canada. ⟨inria-00534375⟩


