hal-00644935, version 1
Classification-based Policy Iteration with a Critic
International Conference on Machine Learning (ICML) (2011) 1049-1056
- 1: INRIA – CNRS : UMR8146 – Université Lille I - Sciences et technologies – Université Lille III - Sciences humaines et sociales – Ecole Centrale de Lille, France (http://www.inria.fr/equipes/sequel)
- 2: INRIA – CNRS : UMR7503 – Université Henri Poincaré - Nancy I – Université Nancy II – Institut National Polytechnique de Lorraine (INPL), France
Bibliographic reference
- Type of document: Peer-reviewed conferences/proceedings
- Domain: Statistics/Other Statistics
- Title: Classification-based Policy Iteration with a Critic
- Abstract: In this paper, we study the effect of adding a value function approximation component (critic) to rollout classification-based policy iteration (RCPI) algorithms. The idea is to use a critic to approximate the return after we truncate the rollout trajectories. This allows us to control the bias and variance of the rollout estimates of the action-value function. Therefore, the introduction of a critic can improve the accuracy of the rollout estimates, and as a result, enhance the performance of the RCPI algorithm. We present a new RCPI algorithm, called direct policy iteration with critic (DPI-Critic), and provide its finite-sample analysis when the critic is based on the LSTD method. We empirically evaluate the performance of DPI-Critic and compare it with DPI and LSPI in two benchmark reinforcement learning problems.
- Full text language: English
- Publication date: 2011-06-29
- Audience: international
- Conference title: International Conference on Machine Learning (ICML)
- Conference city: Seattle
- Country: United States
- Conference date: 2011-06-28
- Conference date (end): 2011-07-02
- Commercial editor: ACM
- Volume title: Proceedings of the 28th International Conference on Machine Learning
- Pagination: 1049-1056
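
The abstract describes truncating rollout trajectories and using a critic to approximate the remaining return. A minimal sketch of that idea follows. It is not the paper's DPI-Critic implementation: the names `truncated_rollout_q`, `step`, `policy`, `critic`, `horizon`, and `num_rollouts` are hypothetical, and the estimate is assumed to be the discounted sum of rewards over the truncated rollout plus the discounted critic value at the last state, averaged over several rollouts.

```python
from typing import Callable, Tuple

State = int
Action = int


def truncated_rollout_q(
    state: State,
    action: Action,
    step: Callable[[State, Action], Tuple[State, float]],  # assumed simulator
    policy: Callable[[State], Action],                     # policy being evaluated
    critic: Callable[[State], float],                      # assumed value estimate
    horizon: int,
    gamma: float,
    num_rollouts: int,
) -> float:
    """Estimate Q(state, action) from rollouts truncated after `horizon` steps.

    The return beyond the truncation point is replaced by the critic's
    value estimate at the last visited state.
    """
    total = 0.0
    for _ in range(num_rollouts):
        s, a = state, action
        ret, discount = 0.0, 1.0
        for _ in range(horizon):
            s, reward = step(s, a)   # take action a, observe next state and reward
            ret += discount * reward
            discount *= gamma
            a = policy(s)            # after the first action, follow the policy
        ret += discount * critic(s)  # critic approximates the truncated tail
        total += ret
    return total / num_rollouts


# Toy chain (an assumption for illustration): every step yields reward 1.
def toy_step(s: State, a: Action) -> Tuple[State, float]:
    return s + 1, 1.0


gamma = 0.5
with_critic = truncated_rollout_q(
    0, 0, toy_step, lambda s: 0, lambda s: 1.0 / (1.0 - gamma), 3, gamma, 5
)
without_critic = truncated_rollout_q(
    0, 0, toy_step, lambda s: 0, lambda s: 0.0, 3, gamma, 5
)
```

In this toy chain the exact value is 1 / (1 - gamma) = 2, so an exact critic removes the truncation bias entirely, while a zero critic reduces to a plain truncated rollout and underestimates the value. A shorter `horizon` leans more on the critic (lower rollout variance, more critic bias), which is the trade-off the abstract refers to.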
Attached file list to this document:
- dpi-critic.pdf
- http://hal.inria.fr/hal-00644935
- oai:hal.inria.fr:hal-00644935
- Submitted on: Friday, 25 November 2011 15:24:52
- Updated on: Saturday, 26 November 2011 09:52:42