Finite Time Analysis of Linear Two-timescale Stochastic Approximation with Markovian Noise

Maxim Kaledin; Eric Moulines; Alexey Naumov; Vladislav Tadic; Hoi-To Wai

Conference paper, Year: 2020

Maxim Kaledin

Role: Author
PersonId: 1083745

Vysšaja škola èkonomiki = National Research University Higher School of Economics [Moscow]

Eric Moulines

Role: Author
PersonId: 1350242
ORCID: 0000-0002-2058-0693
IdRef: 076452476

Centre de Mathématiques Appliquées - Ecole Polytechnique

Vysšaja škola èkonomiki = National Research University Higher School of Economics [Moscow]

Modélisation en pharmacologie de population

Alexey Naumov

Role: Author
PersonId: 1083746

Vysšaja škola èkonomiki = National Research University Higher School of Economics [Moscow]

Vladislav Tadic

Role: Author
PersonId: 1083747

University of Bristol [Bristol]

Hoi-To Wai

Role: Author
PersonId: 1047263

Department of Systems Engineering and Engineering Management

Abstract

The linear two-timescale stochastic approximation (SA) scheme is an important class of algorithms that has become popular in reinforcement learning (RL), particularly for the policy evaluation problem. Recently, a number of works have been devoted to the finite-time analysis of the scheme, especially under the Markovian (non-i.i.d.) noise settings that are ubiquitous in practice. In this paper, we provide a finite-time analysis for linear two-timescale SA. Our bounds show that there is no discrepancy in the convergence rate between Markovian and martingale noise; only the constants are affected by the mixing time of the Markov chain. With an appropriate step size schedule, the transient term in the expected error bound is o(1/k^c) and the steady-state term is O(1/k), where c > 1 and k is the iteration number. Furthermore, we present an asymptotic expansion of the expected error with a matching lower bound of Ω(1/k). A simple numerical experiment is presented to support our theory.
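The kind of recursion the abstract refers to can be illustrated with a minimal scalar sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the coefficients, the additive two-state Markov noise, the step-size constants, and the function name `two_timescale_sa`. The slow iterate `theta` uses a step size of order 1/k and the fast iterate `w` a step size of order 1/k^(2/3), so their ratio tends to zero.

```python
import random


def two_timescale_sa(num_iters, seed=0):
    """Scalar linear two-timescale SA driven by a two-state Markov chain.

    Illustrative only: all constants below are hypothetical choices.
    """
    rng = random.Random(seed)
    # Hypothetical problem data. The fixed point solves
    #   a11*theta + a12*w = b1  and  a21*theta + a22*w = b2.
    a11, a12, a21, a22 = 2.0, 1.0, 1.0, 2.0
    b1, b2 = 1.0, 1.0
    stay_prob = 0.9  # closer to 1 means a slower-mixing chain
    state = 0
    theta, w = 0.0, 0.0
    for k in range(num_iters):
        beta = 1.0 / (k + 10)                 # slow step size
        gamma = 1.0 / (k + 10) ** (2.0 / 3)   # fast step size, beta/gamma -> 0
        # Markovian noise: +1 in state 0, -1 in state 1 (zero stationary mean).
        noise = 1.0 if state == 0 else -1.0
        # Both updates use the iterates from the previous step.
        theta_next = theta + beta * (b1 - a11 * theta - a12 * w + noise)
        w_next = w + gamma * (b2 - a21 * theta - a22 * w + 0.5 * noise)
        theta, w = theta_next, w_next
        # Advance the chain: switch state with probability 1 - stay_prob.
        if rng.random() > stay_prob:
            state = 1 - state
    return theta, w


if __name__ == "__main__":
    print(two_timescale_sa(200_000))
```

For these hypothetical coefficients the fixed point is theta = w = 1/3, so both returned values should drift toward 1/3 as `num_iters` grows. Raising `stay_prob` slows the mixing of the chain; under the abstract's claim this would be expected to change the constants in the error but not the rate. The sketch uses additive noise only, which is a simplification.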

Keywords

stochastic approximation; reinforcement learning; GTD learning; Markovian noise

Domains

Statistics [math.ST]

Main file

Finite Time Analysis of Linear Two-timescale Stochastic Approximation with Markovian Noise.pdf (1.28 MB)

Origin: Files produced by the author(s)

Contributor: Eric Moulines

https://inria.hal.science/hal-03033458

Submitted on: Tuesday, December 1, 2020, 13:20:07

Last modified on: Friday, April 26, 2024, 13:07:14

Long-term archiving on: Tuesday, March 2, 2021, 19:16:21

Dates and versions

hal-03033458, version 1 (01-12-2020)

Identifiers

HAL Id: hal-03033458, version 1

Cite

Maxim Kaledin, Eric Moulines, Alexey Naumov, Vladislav Tadic, Hoi-To Wai. Finite Time Analysis of Linear Two-timescale Stochastic Approximation with Markovian Noise. COLT 2020 - 33rd Conference on Learning Theory, Jul 2020, Graz / Virtual, Austria. ⟨hal-03033458⟩

Collections

X CNRS INRIA INSMI X-CMAP X-DEP-MATHA CMAP INRIA2 IP_PARIS GS-COMPUTER-SCIENCE

43 views

160 downloads
