Active Coverage for PAC Reinforcement Learning

Aymen Al-Marjani; Andrea Tirinzoni; Emilie Kaufmann

Conference paper, Year: 2023

Aymen Al-Marjani

Role: Author

Unité de Mathématiques Pures et Appliquées

Andrea Tirinzoni

Role: Author
PersonId: 1286511

Meta AI Research [Paris]

Emilie Kaufmann

Role: Author
PersonId: 10422
IdHAL: emilie-kaufmann
ORCID: 0000-0002-5496-824X
IdRef: 197040810

Centre National de la Recherche Scientifique

Scool

Abstract

Collecting and leveraging data with good coverage properties plays a crucial role in different aspects of reinforcement learning (RL), including reward-free exploration and offline learning. However, the notion of "good coverage" really depends on the application at hand, as data suitable for one context may not be so for another. In this paper, we formalize the problem of active coverage in episodic Markov decision processes (MDPs), where the goal is to interact with the environment so as to fulfill given sampling requirements. This framework is sufficiently flexible to specify any desired coverage property, making it applicable to any problem that involves online exploration. Our main contribution is an instance-dependent lower bound on the sample complexity of active coverage and a simple game-theoretic algorithm, COVGAME, that nearly matches it. We then show that COVGAME can be used as a building block to solve different PAC RL tasks. In particular, we obtain a simple algorithm for PAC reward-free exploration with an instance-dependent sample complexity that, in certain MDPs which are "easy to explore", is lower than the minimax one. By further coupling this exploration algorithm with a new technique to do implicit eliminations in policy space, we obtain a computationally-efficient algorithm for best-policy identification whose instance-dependent sample complexity scales with gaps between policy values.

Keywords

Reinforcement learning; Coverage; Reward-free exploration; Best-policy identification

Domains

Machine Learning [stat.ML]

Main file

COLT23.pdf (830.98 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-04215441

Submitted on: Monday, September 25, 2023, 15:19:34

Last modified on: Thursday, May 2, 2024, 15:24:26

Long-term archiving on: Tuesday, December 26, 2023, 18:15:53

Dates and versions

hal-04215441, version 1 (25-09-2023)

License

Attribution - NonCommercial - ShareAlike

Identifiers

HAL Id: hal-04215441, version 1

Cite

Aymen Al-Marjani, Andrea Tirinzoni, Emilie Kaufmann. Active Coverage for PAC Reinforcement Learning. Conference on Learning Theory 2023, Jul 2023, Bangalore, India. ⟨hal-04215441⟩

Collections

ENS-LYON CNRS INRIA INSMI CRISTAL INRIA2 UNIV-LILLE UDL CRISTAL-SCOOL ANR
