Non-Asymptotic Analysis of a UCB-based Top Two Algorithm - Inria - Institut national de recherche en sciences et technologies du numérique Accéder directement au contenu
Communication Dans Un Congrès Année : 2023

Non-Asymptotic Analysis of a UCB-based Top Two Algorithm

Marc Jourdan
Rémy Degenne
  • Fonction : Auteur
  • PersonId : 748911
  • IdHAL : remydegenne

Résumé

A Top Two sampling rule for bandit identification is a method which selects the next arm to sample from among two candidate arms, a leader and a challenger. Due to their simplicity and good empirical performance, they have received increased attention in recent years. For fixed-confidence best arm identification, theoretical guarantees for Top Two methods have only been obtained in the asymptotic regime, when the error level vanishes. We derive the first non-asymptotic upper bound on the expected sample complexity of a Top Two algorithm holding for any error level. Our analysis highlights sufficient properties for a regret minimization algorithm to be used as leader. They are satisfied by the UCB algorithm and our proposed UCB-based Top Two algorithm enjoys simultaneously non-asymptotic guarantees and competitive empirical performance.

Dates et versions

hal-03830958 , version 1 (26-10-2022)

Licence

Paternité

Identifiants

Citer

Marc Jourdan, Rémy Degenne. Non-Asymptotic Analysis of a UCB-based Top Two Algorithm. Thirty-seventh Conference on Neural Information Processing Systems, Dec 2023, New Orleans (Louisiana), United States. ⟨hal-03830958⟩
35 Consultations
0 Téléchargements

Altmetric

Partager

Gmail Facebook X LinkedIn More