Boosted Fitted Q-Iteration - Inria - Institut national de recherche en sciences et technologies du numérique
Conference paper, Year: 2017

Boosted Fitted Q-Iteration

Abstract

This paper studies B-FQI, an Approximated Value Iteration (AVI) algorithm that uses a boosting procedure to estimate the action-value function in reinforcement learning problems. B-FQI is an iterative off-line algorithm that, given a dataset of transitions, builds an approximation of the optimal action-value function by summing the approximations of the Bellman residuals across all iterations. The advantage of this approach w.r.t. other AVI methods is twofold: (1) while keeping the same function space at each iteration, B-FQI can represent more complex functions by considering an additive model; (2) since the Bellman residual decreases as the optimal value function is approached, regression problems become easier as iterations proceed. We study B-FQI both theoretically, providing a finite-sample error upper bound for it, and empirically, by comparing its performance to that of FQI in different domains and using different regression techniques.
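The additive construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the common reading in which each iteration fits a regressor to the empirical Bellman residual r + γ max_a' Q_k(s', a') − Q_k(s, a) and adds the fit to the running estimate. The names `boosted_fqi`, `fit_least_squares`, `q_value`, and `one_hot`, the least-squares weak learner, and the three-state chain problem are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_least_squares(X, y):
    """Hypothetical weak learner: ordinary least squares on the given features."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def boosted_fqi(features, dataset, n_actions, gamma, n_iterations):
    """Sketch of boosted fitted Q-iteration (assumed form, see lead-in).

    features(s, a) returns a 1-D feature vector.
    dataset is a list of transitions (s, a, r, s_next, done).
    Returns one weight vector per iteration; their sum defines Q.
    """
    X = np.array([features(s, a) for s, a, _, _, _ in dataset])
    rewards = np.array([r for _, _, r, _, _ in dataset], dtype=float)
    not_done = np.array([0.0 if d else 1.0 for *_, d in dataset])
    # Features of every (next state, action) pair: shape (n, n_actions, dim).
    X_next = np.array(
        [[features(s2, b) for b in range(n_actions)] for _, _, _, s2, _ in dataset]
    )

    weights = []
    q = np.zeros(len(dataset))                    # Q_k(s, a) on the dataset, Q_0 = 0
    q_next = np.zeros((len(dataset), n_actions))  # Q_k(s', b) for every action b
    for _ in range(n_iterations):
        # Empirical Bellman residual of the current estimate.
        residual = rewards + gamma * not_done * q_next.max(axis=1) - q
        w = fit_least_squares(X, residual)
        weights.append(w)
        # Additive update: Q_{k+1} = Q_k + (fit of the residual).
        q += X @ w
        q_next += X_next @ w
    return weights


def q_value(weights, features, s, a):
    """Evaluate the additive model: the sum of all fitted residual models."""
    return sum(float(features(s, a) @ w) for w in weights)


# Toy deterministic chain (an assumption for illustration): states 0, 1, 2,
# where 2 is terminal; action 0 stays, action 1 moves right; reward 1 on 1 -> 2.
def one_hot(s, a):
    return np.eye(6)[2 * s + a]


dataset = [
    (0, 0, 0.0, 0, False),
    (0, 1, 0.0, 1, False),
    (1, 0, 0.0, 1, False),
    (1, 1, 1.0, 2, True),
]
weights = boosted_fqi(one_hot, dataset, n_actions=2, gamma=0.9, n_iterations=20)
```

With one-hot features the regression is exact, so the residuals fitted in later iterations shrink to zero, which mirrors point (2) of the abstract. Because a sum of linear models is itself linear, this particular weak learner does not show the added expressiveness of point (1); a nonlinear regressor would be needed for that.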
Main file
tosatto17a.pdf (371.64 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01653332, version 1 (01-12-2017)

Identifiers

  • HAL Id: hal-01653332, version 1

Cite

Samuele Tosatto, Matteo Pirotta, Carlo d'Eramo, Marcello Restelli. Boosted Fitted Q-Iteration. 34th International Conference on Machine Learning (ICML), Aug 2017, Sydney, Australia. ⟨hal-01653332⟩
153 views
114 downloads
