Conference paper. Year: 2010

All that Glitters is not Gold: Using Landmarks for Reward Shaping in FPG

Olivier Buffet
Joerg Hoffmann
  • Role: Author
  • PersonId : 864556

Abstract

Landmarks are facts that must be true at some point in any plan. It has recently been proposed in classical planning to use landmarks for the automatic generation of heuristic functions. We herein apply this idea in probabilistic planning. We focus on the FPG tool, which derives a factored policy based on learning from samples into the state space. The rationale is that FPG's performance can be improved significantly by a trivial heuristic that counts the number of false goals; landmarks provide much better estimates at little overhead cost. We devise improved versions of the classical landmarks heuristic, including a Markovian one that, unlike previous ones, does not depend on the state history. As done previously in FPG for the goal counting, we use the heuristics for reward shaping: the planner gets a positive reward when improving the heuristic value. Based on previous work, we argue that such shaping is policy invariant for Markovian heuristics. Our empirical results confirm that the landmarks heuristics are almost as fast as the goal counting, while delivering much more accurate estimates for initial states. In spite of this, overall planner performance is almost never improved. We discuss some intuitions as to why that is so.
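
The reward-shaping idea described in the abstract can be illustrated with a minimal Python sketch. It assumes the standard potential-based form of shaping with potential phi(s) = -h(s), which is a common way to obtain policy invariance for a Markovian heuristic h; the abstract does not state the exact formula used in FPG, so this form, the function names `goal_count` and `shaping_reward`, the fact names, and the discount `gamma` are all illustrative assumptions.

```python
from typing import FrozenSet

# Assumption: a state is represented as the set of facts that currently hold.
State = FrozenSet[str]


def goal_count(state: State, goals: FrozenSet[str]) -> int:
    """Trivial heuristic: the number of goal facts that are still false."""
    return len(goals - state)


def shaping_reward(h_before: float, h_after: float, gamma: float = 1.0) -> float:
    """Potential-based shaping term gamma * phi(s') - phi(s), with phi = -h.

    The term is positive when the heuristic value decreases (improves).
    """
    return gamma * (-h_after) - (-h_before)


# Hypothetical two-goal problem, for illustration only.
goals = frozenset({"at-home", "has-key"})
s0 = frozenset({"at-work"})             # both goals false, h = 2
s1 = frozenset({"at-work", "has-key"})  # one goal achieved, h = 1

bonus = shaping_reward(goal_count(s0, goals), goal_count(s1, goals))
print(bonus)
```

A landmarks heuristic would replace `goal_count` with a count of landmarks still to be achieved. With `gamma = 1.0`, the shaping terms along a trajectory telescope to `h(first state) - h(last state)`, so the total bonus depends only on the endpoints, which is the intuition behind policy invariance for heuristics that depend only on the current state.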
Main file
icaps10-ws.pdf (143.72 KB)
Origin: Files produced by the author(s)

Dates and versions

inria-00534375, version 1 (10-11-2010)

Identifiers

  • HAL Id: inria-00534375, version 1

Cite

Olivier Buffet, Joerg Hoffmann. All that Glitters is not Gold: Using Landmarks for Reward Shaping in FPG. ICAPS-10 Workshop on Planning and Scheduling Under Uncertainty, May 2010, Toronto, Canada. ⟨inria-00534375⟩
