Conference paper, Year: 2010

Analysis of a Classification-based Policy Iteration Algorithm

Alessandro Lazaric
Mohammad Ghavamzadeh
  • Role: Author
  • PersonId : 868946
Remi Munos
  • Role: Author
  • PersonId : 836863

Abstract

We present a classification-based policy iteration algorithm, called Direct Policy Iteration, and provide its finite-sample analysis. Our results state a performance bound in terms of the number of policy improvement steps, the number of rollouts used in each iteration, the capacity of the considered policy space, and a new capacity measure which indicates how well the policy space can approximate policies that are greedy w.r.t. any of its members. The analysis reveals a tradeoff between the estimation and approximation errors in this classification-based policy iteration setting. We also study the consistency of the method when there exists a sequence of policy spaces with increasing capacity.
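The abstract names the ingredients (policy improvement steps, rollouts per iteration, a restricted policy space) but not the procedure itself. The following is a minimal sketch of how a classification-based policy iteration loop of this kind can look, not the authors' implementation. Everything concrete in it is an assumption made for illustration: the five-state chain, the hypothetical threshold policy space, the names `step`, `threshold_policy`, `rollout_q` and `dpi`, and the choice of loss (the empirical gap between the best estimated action value and the value of the action the candidate policy picks).

```python
import random

N_STATES, ACTIONS, GAMMA = 5, (0, 1), 0.9  # toy chain; action 0 = left, 1 = right


def step(x, a):
    """Deterministic chain dynamics; reward 1 for landing on the last state."""
    nxt = max(0, x - 1) if a == 0 else min(N_STATES - 1, x + 1)
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)


def threshold_policy(t):
    """Hypothetical policy space: go right iff x >= t, for t in 0..N_STATES."""
    return lambda x: 1 if x >= t else 0


def rollout_q(x, a, policy, horizon):
    """Truncated rollout estimate of the value of taking a in x, then following policy."""
    total, discount = 0.0, 1.0
    for _ in range(horizon):
        x, r = step(x, a)
        total += discount * r
        discount *= GAMMA
        a = policy(x)  # after the first step, follow the current policy
    return total


def dpi(n_iterations=5, n_sampled_states=20, n_rollouts=1, horizon=30, seed=0):
    """Return the threshold of the policy selected after n_iterations improvement steps."""
    rng = random.Random(seed)
    t = N_STATES  # initial policy: always go left
    for _ in range(n_iterations):
        current = threshold_policy(t)
        # Rollout set: states drawn uniformly (an assumed sampling distribution).
        states = [rng.randrange(N_STATES) for _ in range(n_sampled_states)]
        # Average n_rollouts rollouts per (state, action) pair.
        q_hat = [
            [sum(rollout_q(x, a, current, horizon) for _ in range(n_rollouts)) / n_rollouts
             for a in ACTIONS]
            for x in states
        ]

        def empirical_loss(candidate):
            pi = threshold_policy(candidate)
            gaps = (max(q) - q[pi(x)] for x, q in zip(states, q_hat))
            return sum(gaps) / len(states)

        # "Classifier" step: pick the policy in the space with the smallest empirical loss.
        t = min(range(N_STATES + 1), key=empirical_loss)
    return t
```

Calling `dpi()` returns the threshold of the selected policy. In this toy setting the policy space contains the greedy policy, so the approximation error is zero; a poorer policy space or fewer rollouts would expose the estimation/approximation tradeoff the abstract refers to.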

Domains

Computer Science
Main file
dpi-jmlr.pdf (258.41 KB)
Origin: Files produced by the author(s)

Dates and versions

inria-00482065, version 1 (07-05-2010)
inria-00482065, version 2 (25-01-2011)
inria-00482065, version 3 (30-01-2012)

Identifiers

  • HAL Id: inria-00482065, version 3

Cite

Alessandro Lazaric, Mohammad Ghavamzadeh, Remi Munos. Analysis of a Classification-based Policy Iteration Algorithm. ICML - 27th International Conference on Machine Learning, Jun 2010, Haifa, Israel. pp.607-614. ⟨inria-00482065v3⟩
