Experiments with infinite-horizon, policy-gradient estimation, JAIR, vol.15, pp.351-381, 2001. ,
The factored policy-gradient planner, Artificial Intelligence, vol.173, issue.5-6, pp.5-6722, 2009. ,
DOI : 10.1016/j.artint.2008.11.008
URL : https://hal.archives-ouvertes.fr/inria-00330031
Robot shaping: developing autonomous agents through learning, Artificial Intelligence, vol.71, issue.2, pp.321-370, 1994. ,
DOI : 10.1016/0004-3702(94)90047-7
Landmarks, critical paths and abstractions: What's the difference anyway?, Proc. ICAPS'09, 2009. ,
The FF planning system: Fast plan generation through heuristic search, JAIR, vol.14, pp.253-302, 2001. ,
Ordered landmarks in planning, JAIR, vol.22, pp.215-278, 2004. ,
Cost-optimal planning with landmarks, Proc. IJCAI'09, 2009. ,
Reward functions for accelerated learning, Proc. ICML'94, 1994. ,
Policy invariance under reward transformations: Theory and application to reward shaping, Proc. ICML'99, 1999. ,
Shaping in reinforcement learning by changing the physics of the problem, Proc. ICML'00, 2000. ,
Landmarks revisited, Proc. AAAI'08, pp.975-982, 2008. ,
Reinforcement Learning Algorithms for MDPs, 2009. ,
DOI : 10.1002/9780470400531.eorms0714
Decision-theoretic planning with non-Markovian rewards, JAIR, vol.25, pp.17-74, 2006. ,
Potential-based shaping and Q-value initialization are equivalent, JAIR, vol.19, pp.205-208, 2003. ,
Simple statistical gradient-following algorithms for connectionnist reinforcement learning, pp.229-256, 1992. ,
FF-Replan: a baseline for probabilistic planning, Proc. ICAPS'07, 2007. ,