
Deterministic Policy Gradient Algorithms

Abstract: In this paper we consider deterministic policy gradient algorithms for reinforcement learning with continuous actions. The deterministic policy gradient has a particularly appealing form: it is the expected gradient of the action-value function. This simple form means that the deterministic policy gradient can be estimated much more efficiently than the usual stochastic policy gradient. To ensure adequate exploration, we introduce an off-policy actor-critic algorithm that learns a deterministic target policy from an exploratory behaviour policy. We demonstrate that deterministic policy gradient algorithms can significantly outperform their stochastic counterparts in high-dimensional action spaces.
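The "expected gradient of the action-value function" form from the abstract can be illustrated with a minimal numerical sketch. This is a hypothetical toy setup (scalar linear policy, hand-picked quadratic critic, names like `dpg_step` are illustrative assumptions), not the paper's experiments; it only shows the chain-rule structure of the deterministic policy gradient estimator.

```python
import numpy as np

# Toy deterministic policy gradient sketch (assumptions, not the paper's setup):
#   Policy:  a = mu_theta(s) = theta * s      (scalar state and action)
#   Critic:  Q(s, a) = -(a - s)**2            (maximised when a = s)
#
# Deterministic policy gradient:
#   grad_theta J = E_s[ grad_theta mu_theta(s) * grad_a Q(s, a)|_{a=mu_theta(s)} ]

def dpg_step(theta, states, lr=0.1):
    """One gradient-ascent step on theta using a sample-based DPG estimate."""
    a = theta * states                   # actions from the deterministic policy
    dq_da = -2.0 * (a - states)          # grad_a Q(s, a) evaluated at a = mu(s)
    dmu_dtheta = states                  # grad_theta mu_theta(s)
    grad = np.mean(dmu_dtheta * dq_da)   # Monte-Carlo estimate of the expectation
    return theta + lr * grad

rng = np.random.default_rng(0)
states = rng.uniform(0.5, 1.5, size=256)
theta = 0.0
for _ in range(200):
    theta = dpg_step(theta, states)
print(round(theta, 2))  # theta converges toward 1.0, so that a = mu(s) = s
```

Note that the estimator needs only the gradient of `Q` with respect to the action at a single point per state, rather than an expectation over a sampled action distribution; this is the efficiency advantage the abstract refers to.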
Document type: Conference papers
Cited literature: 21 references
Contributor: Thomas Degris
Submitted on: Wednesday, January 29, 2014 - 6:07:21 PM
Last modification on: Friday, April 1, 2022 - 5:16:34 PM
Long-term archiving on: Sunday, April 9, 2017 - 2:40:14 AM

  • HAL Id: hal-00938992, version 1



David Silver, Guy Lever, Nicolas Heess, Thomas Degris, Daan Wierstra, et al. Deterministic Policy Gradient Algorithms. ICML, Jun 2014, Beijing, China. ⟨hal-00938992⟩


