Zap Q-Learning for Optimal Stopping - Inria - Institut national de recherche en sciences et technologies du numérique
Conference Paper, Year: 2020

Zap Q-Learning for Optimal Stopping

Abstract

This paper concerns approximate solutions to the optimal stopping problem for a geometrically ergodic Markov chain on a continuous state space. The starting point is the Galerkin relaxation of the dynamic programming equations introduced by Tsitsiklis and Van Roy in the 1990s, which motivated their Q-learning algorithm for optimal stopping. It is known that the convergence rate of Q-learning is in many cases very slow. The reason for slow convergence is explained here, along with a variant of the Zap Q-learning algorithm designed to achieve the optimal rate of convergence. The main contribution is to establish consistency of the Zap Q-learning algorithm in a linear function approximation setting. The theoretical results are illustrated using an example from finance.
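To make the setting concrete, the following is a minimal sketch of a Zap-style matrix-gain Q-learning update for optimal stopping with linear function approximation. Everything here is illustrative and not taken from the paper: the AR(1) chain, the put-style payoff `h`, the Gaussian basis, the step sizes, and the clipping safeguard are all assumptions. The structural idea it shows is the two-time-scale update: a fast estimate of the mean linearization matrix, whose inverse is then used as a matrix gain for the parameter update.

```python
import numpy as np

rng = np.random.default_rng(0)

gamma = 0.95                      # discount factor
d = 5                             # number of basis functions
centers = np.linspace(-2, 2, d)   # Gaussian basis centers (assumption)

def phi(x):
    """Gaussian radial-basis features for the scalar state x."""
    return np.exp(-0.5 * (x - centers) ** 2)

def h(x):
    """Payoff received upon stopping (illustrative put-style payoff)."""
    return max(1.0 - x, 0.0)

def step(x):
    """One transition of an ergodic Markov chain (AR(1), illustrative)."""
    return 0.8 * x + 0.5 * rng.standard_normal()

theta = np.zeros(d)               # Q(x) ~ theta @ phi(x): value of continuing
A_hat = -np.eye(d)                # matrix-gain estimate, kept invertible at start

x = 0.0
for n in range(1, 5001):
    alpha = 1.0 / (n + 10)            # slow step size for theta
    beta = 1.0 / (n + 10) ** 0.85     # faster step size for A_hat (two time scales)

    x_next = step(x)
    f, f_next = phi(x), phi(x_next)

    # Temporal-difference term: compare against the better of stopping or continuing
    q_next = theta @ f_next
    td = gamma * max(h(x_next), q_next) - theta @ f

    # psi: next-state feature if continuation is preferred, else zero (stopping)
    psi = f_next if q_next >= h(x_next) else np.zeros(d)
    A_n = np.outer(f, gamma * psi - f)

    # Zap step: track the running mean of A_n, use its inverse as a matrix gain
    A_hat += beta * (A_n - A_hat)
    theta -= alpha * np.linalg.solve(A_hat, f * td)
    theta = np.clip(theta, -10.0, 10.0)   # numerical safeguard for this toy run

    x = x_next

q0 = theta @ phi(0.0)
print(f"approximate continuation value at x=0: {q0:.3f}")
```

The ordinary Tsitsiklis–Van Roy Q-learning update would replace the matrix gain `A_hat` with a scalar step size; the matrix gain is what Zap adds to pursue the optimal asymptotic convergence rate.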

Dates and versions

hal-03094388 , version 1 (04-01-2021)

Identifiers

Cite

Shuhang Chen, Adithya Devraj, Ana Bušić, Sean Meyn. Zap Q-Learning for Optimal Stopping. ACC 2020 - American Control Conference, Jul 2020, Denver / Virtual, United States. pp.3920-3925, ⟨10.23919/ACC45564.2020.9147481⟩. ⟨hal-03094388⟩