Preprint / Working Paper, Year: 2023

Temporal Difference Learning with Continuous Time and State in the Stochastic Setting

Abstract

We consider the problem of continuous-time policy evaluation: learning, from observations, the value function associated with an uncontrolled continuous-time stochastic dynamics and a reward function. We propose two original variants of the well-known TD(0) method that use vanishing time steps; one is model-free and the other is model-based. For both methods, we prove theoretical convergence rates, which we then verify through numerical simulations. Alternatively, these methods can be interpreted as novel reinforcement learning approaches for approximating solutions of linear PDEs (partial differential equations) or linear BSDEs (backward stochastic differential equations).
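To make the setting concrete, here is a minimal sketch of a model-free TD(0) update with a small time step h, in the spirit of the abstract above. The dynamics (an Ornstein-Uhlenbeck process), the reward r, the discount rate rho, the feature map phi, and the step-size schedule are all illustrative assumptions made for this sketch, not the paper's actual construction or its rates.

```python
import numpy as np

# Illustrative model-free TD(0) sketch with a small time step h, for
# evaluating the discounted value of an uncontrolled diffusion. Every
# modeling choice below (OU dynamics, reward, features, step sizes) is
# an assumption for this sketch, not the paper's construction.

rng = np.random.default_rng(0)

h = 1e-2       # time step (the paper studies the vanishing-step regime h -> 0)
rho = 1.0      # discount rate in V(x) = E[ int_0^inf e^{-rho t} r(X_t) dt | X_0 = x ]
sigma = 0.5    # diffusion coefficient of dX_t = -X_t dt + sigma dW_t

def r(x):      # reward function (arbitrary smooth choice)
    return np.cos(x)

def phi(x):    # features for the linear approximation V(x) ~ theta @ phi(x)
    return np.array([1.0, x, x * x])

theta = np.zeros(3)
x = rng.normal()
for k in range(200_000):
    # one Euler-Maruyama step of the uncontrolled dynamics
    x_next = x - x * h + sigma * np.sqrt(h) * rng.normal()
    # temporal-difference error accumulated over one step of length h
    delta = r(x) * h + np.exp(-rho * h) * (theta @ phi(x_next)) - theta @ phi(x)
    # semi-gradient TD(0) update with a slowly decaying step size
    alpha = 1.0 / (1.0 + 1e-3 * k)
    theta += alpha * delta * phi(x)
    x = x_next

print("fitted coefficients theta:", theta)
```

A model-based variant would additionally exploit knowledge of the drift and diffusion coefficients rather than relying only on sampled transitions; the paper's contribution is the convergence analysis of such schemes as the time step vanishes.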
Main file: v4_june2023.pdf (624.25 KB)
Origin: Files produced by the author(s)
License: CC BY - Attribution

Dates and versions

hal-03574645 , version 1 (15-02-2022)
hal-03574645 , version 2 (10-06-2022)
hal-03574645 , version 3 (05-06-2023)

License

CC BY - Attribution

Identifiers

HAL Id: hal-03574645

Cite

Ziad Kobeissi, Francis Bach. Temporal Difference Learning with Continuous Time and State in the Stochastic Setting. 2023. ⟨hal-03574645v3⟩
105 Views
134 Downloads

