Analysis of a Classification-based Policy Iteration Algorithm

Alessandro Lazaric 1, Mohammad Ghavamzadeh 1, Remi Munos 1
1 SEQUEL - Sequential Learning
LIFL - Laboratoire d'Informatique Fondamentale de Lille, Inria Lille - Nord Europe, LAGIS - Laboratoire d'Automatique, Génie Informatique et Signal
Abstract: We present a classification-based policy iteration algorithm, called Direct Policy Iteration, and provide its finite-sample analysis. Our results state a performance bound in terms of the number of policy improvement steps, the number of rollouts used in each iteration, the capacity of the considered policy space, and a new capacity measure which indicates how well the policy space can approximate policies that are greedy w.r.t. any of its members. The analysis reveals a tradeoff between the estimation and approximation errors in this classification-based policy iteration setting. We also study the consistency of the method when there exists a sequence of policy spaces with increasing capacity.
Document type:
Conference paper
ICML - 27th International Conference on Machine Learning, Jun 2010, Haifa, Israel. Omnipress, pp.607-614, 2010

https://hal.inria.fr/inria-00482065
Contributor: Mohammad Ghavamzadeh
Submitted on: Monday, 30 January 2012 - 13:56:45
Last modified on: Thursday, 11 January 2018 - 06:22:13
Document(s) archived on: Wednesday, 14 December 2016 - 02:29:10

File

dpi-jmlr.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: inria-00482065, version 3

Citation

Alessandro Lazaric, Mohammad Ghavamzadeh, Remi Munos. Analysis of a Classification-based Policy Iteration Algorithm. ICML - 27th International Conference on Machine Learning, Jun 2010, Haifa, Israel. Omnipress, pp.607-614, 2010. 〈inria-00482065v3〉

Metrics

Record views

474

File downloads

485