Optimistic planning in Markov decision processes using a generative model - Inria - Institut national de recherche en sciences et technologies du numérique
Conference paper, Year: 2014

Optimistic planning in Markov decision processes using a generative model

Gunnar Kedenburg
  • Role: Author
  • PersonId: 961176
Rémi Munos
  • Role: Author
  • PersonId: 836863

Abstract

We consider the problem of online planning in a Markov decision process with discounted rewards for any given initial state. Specifically, we consider the PAC sample complexity problem of computing, with probability 1−δ, an ε-optimal action using the smallest possible number of calls to the generative model (which provides reward and next-state samples). We design an algorithm, called StOP (for Stochastic-Optimistic Planning), based on the "optimism in the face of uncertainty" principle. StOP can be used in the general setting, requires only a generative model, and enjoys a complexity bound that depends only on the local structure of the MDP.
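To make the setting concrete, the following is a minimal sketch of what "planning with a generative model" looks like in code. It is not StOP: it is a plain uniform sparse-sampling baseline that draws the same number of samples everywhere in the lookahead tree. The two-state MDP, the discount factor 0.9, and the names `generative_model`, `estimate_q`, and `plan` are all hypothetical and chosen only for illustration.

```python
import random

GAMMA = 0.9       # discount factor (assumed value, not from the paper)
ACTIONS = (0, 1)  # hypothetical action set: 0 = "stay", 1 = "move"


def generative_model(state, action, rng):
    """Hypothetical two-state MDP: return one (reward, next_state) sample."""
    if action == 0:
        # "Stay" is deterministic: small reward in state 0, none in state 1.
        return (0.1 if state == 0 else 0.0), state
    # "Move" lands in state 1 with probability 0.8 and pays 1 on arrival.
    next_state = 1 if rng.random() < 0.8 else 0
    return (1.0 if next_state == 1 else 0.0), next_state


def estimate_q(state, action, depth, width, rng, calls):
    """Monte Carlo estimate of Q(state, action), truncated at `depth` steps."""
    if depth == 0:
        return 0.0
    total = 0.0
    for _ in range(width):
        reward, next_state = generative_model(state, action, rng)
        calls[0] += 1  # count every call to the generative model
        best_next = max(
            estimate_q(next_state, a, depth - 1, width, rng, calls)
            for a in ACTIONS
        )
        total += reward + GAMMA * best_next
    return total / width


def plan(state, depth=3, width=4, seed=0):
    """Return (chosen action, Q estimates, number of generative-model calls)."""
    rng = random.Random(seed)
    calls = [0]
    q = [estimate_q(state, a, depth, width, rng, calls) for a in ACTIONS]
    best = max(ACTIONS, key=lambda a: q[a])
    return best, q, calls[0]


if __name__ == "__main__":
    action, q_values, n_calls = plan(state=0)
    print("chosen action:", action)
    print("Q estimates:", q_values)
    print("generative-model calls:", n_calls)
```

With `depth=3` and `width=4` this baseline always spends 584 calls, whatever the MDP looks like, and that budget grows exponentially with the depth. The abstract's claim is that StOP avoids this kind of worst-case cost by concentrating samples where they matter, so that its bound depends on the local structure of the MDP.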
Main file
StOP_nips.pdf (362.17 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01079366, version 1 (1 November 2014)

Identifiers

  • HAL Id: hal-01079366, version 1

Cite

Balázs Szörényi, Gunnar Kedenburg, Rémi Munos. Optimistic planning in Markov decision processes using a generative model. Advances in Neural Information Processing Systems 27, Dec 2014, Montréal, Canada. ⟨hal-01079366⟩
172 views
188 downloads
