Conference Papers Year : 2019

Planning in entropy-regularized Markov decision processes and games

Jean-Bastien Grill, Omar D Domingues, Pierre Ménard, Rémi Munos, Michal Valko

Abstract

We propose SmoothCruiser, a new planning algorithm for estimating the value function in entropy-regularized Markov decision processes and two-player games, given a generative model of the environment. SmoothCruiser makes use of the smoothness of the Bellman operator promoted by the regularization to achieve problem-independent sample complexity of order O(1/ε^4) for a desired accuracy ε, whereas for non-regularized settings there are no known algorithms with guaranteed polynomial sample complexity in the worst case.
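
As a rough illustration of the smoothness mentioned in the abstract (this is not SmoothCruiser itself, only a sketch of the standard entropy-regularized backup at a single state), the hard maximum over action values is replaced by a log-sum-exp with temperature `lam`. The action values `q` and the temperature below are hypothetical numbers chosen for the example, and the function names are illustrative.

```python
import math

def log_sum_exp(q_values, lam):
    """Smoothed maximum: lam * log(sum(exp(q / lam))), computed stably."""
    m = max(q_values)
    return m + lam * math.log(sum(math.exp((q - m) / lam) for q in q_values))

def softmax_policy(q_values, lam):
    """Gradient of log_sum_exp: the policy attaining the regularized maximum."""
    m = max(q_values)
    weights = [math.exp((q - m) / lam) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]

def entropy(probs):
    """Shannon entropy of a distribution with strictly positive entries."""
    return -sum(p * math.log(p) for p in probs)

# Hypothetical action values at one state and a hypothetical temperature.
q = [1.0, 0.5, -0.2]
lam = 0.1

v = log_sum_exp(q, lam)        # regularized value of the state
pi = softmax_policy(q, lam)    # corresponding softmax policy

# The regularized value equals expected action value plus lam times entropy.
v_check = sum(p * qi for p, qi in zip(pi, q)) + lam * entropy(pi)
```

The smoothed value lies between `max(q)` and `max(q) + lam * log(len(q))`, so a small temperature stays close to the unregularized maximum while remaining differentiable in the action values.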
Main file
smoothcruiser2019.pdf (555.21 KB)
Origin : Files produced by the author(s)

Dates and versions

hal-02387515 , version 1 (29-11-2019)

Identifiers

  • HAL Id : hal-02387515 , version 1

Cite

Jean-Bastien Grill, Omar D Domingues, Pierre Ménard, Rémi Munos, Michal Valko. Planning in entropy-regularized Markov decision processes and games. Neural Information Processing Systems, 2019, Vancouver, Canada. ⟨hal-02387515⟩