Conference Papers, Year: 2019

Finding the bandit in a graph: Sequential search-and-stop

Pierre Perrault (1, 2), Vianney Perchet (2), Michal Valko (1, 3)

Abstract

We consider the problem where an agent wants to find a hidden object that is randomly located in some vertex of a directed acyclic graph (DAG) according to a fixed but possibly unknown distribution. The agent can only examine vertices whose in-neighbors have already been examined. In this paper, we address a learning setting where we allow the agent to stop before having found the object and restart searching on a new independent instance of the same problem. Our goal is to maximize the total number of hidden objects found given a time budget. The agent can thus skip an instance after realizing that it would spend too much time on it. Our contributions are to both search theory and multi-armed bandits. If the distribution is known, we provide a quasi-optimal and efficient stationary strategy. If the distribution is unknown, we additionally show how to sequentially approximate it and, at the same time, act near-optimally in order to collect as many hidden objects as possible.
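To make the setting above concrete, the following is a minimal, self-contained sketch of the search-and-stop environment the abstract describes: an object hidden in a DAG vertex according to a fixed distribution, examinations allowed only once all in-neighbors have been examined, and the option to abandon an instance and move on to a fresh one within a global time budget. The example DAG, the examination costs, the greedy probability-to-cost ordering, and the fixed-threshold stopping rule are illustrative assumptions; they are not the quasi-optimal strategy or the learning algorithm developed in the paper.

```python
import random

# DAG given as: vertex -> list of in-neighbors (all must be examined first).
in_neighbors = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
cost = {"a": 1.0, "b": 2.0, "c": 1.0, "d": 3.0}   # time spent examining each vertex
prob = {"a": 0.1, "b": 0.2, "c": 0.5, "d": 0.2}   # (known) hiding distribution


def greedy_order(in_neighbors, cost, prob):
    """Fixed (stationary) examination order: among currently accessible vertices,
    always pick the best probability-to-cost ratio. A plausible heuristic for this
    sketch, not the strategy analyzed in the paper."""
    done, order = set(), []
    while len(done) < len(in_neighbors):
        accessible = [v for v in in_neighbors
                      if v not in done and all(u in done for u in in_neighbors[v])]
        best = max(accessible, key=lambda v: prob[v] / cost[v])
        order.append(best)
        done.add(best)
    return order


def search_and_stop(budget, stop_after=4.0, seed=0):
    """Run independent instances until the budget runs out; within an instance,
    abandon the search once its cumulative cost would exceed `stop_after`
    (an illustrative stopping rule, not the paper's)."""
    rng = random.Random(seed)
    order = greedy_order(in_neighbors, cost, prob)
    found, remaining = 0, budget
    while remaining > 0:
        hidden = rng.choices(list(prob), weights=list(prob.values()))[0]
        spent = 0.0
        for v in order:
            if spent + cost[v] > min(remaining, stop_after):
                break                      # skip the rest of this instance
            spent += cost[v]
            if v == hidden:
                found += 1
                break
        if spent == 0.0:                   # cannot afford even one examination
            break
        remaining -= spent
    return found


print(search_and_stop(budget=50.0))
```

Under these assumptions, the searcher follows one precomputed order on every instance (a stationary strategy) and trades off finishing the current instance against starting a new one, which is exactly the tension the stopping rule is meant to illustrate.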
Main file: perrault2019finding.pdf (680.06 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02387465, version 1 (29-11-2019)

Identifiers

  • HAL Id: hal-02387465, version 1

Cite

Pierre Perrault, Vianney Perchet, Michal Valko. Finding the bandit in a graph: Sequential search-and-stop. International Conference on Artificial Intelligence and Statistics, 2019, Okinawa, Japan. ⟨hal-02387465⟩