Exponential Asymptotic Optimality of Whittle Index Policy.

Nicolas Gast; Bruno Gaujal; Chen Yan

doi:10.1007/s11134-023-09875-x

Journal article, Queueing Systems, Year: 2023

Nicolas Gast

Role: Author
PersonId: 1247
IdHAL: nicolas-gast
ORCID: 0000-0001-6884-8698
IdRef: 233247874

Performance analysis and optimization of LARge Infrastructures and Systems

Bruno Gaujal

Role: Author
PersonId: 11644
IdHAL: bruno-gaujal
ORCID: 0000-0001-9081-8401
IdRef: 074658441

Performance analysis and optimization of LARge Infrastructures and Systems

Chen Yan

Role: Author
PersonId: 1084438

Performance analysis and optimization of LARge Infrastructures and Systems

Abstract

We evaluate the performance of the Whittle index policy for restless Markovian bandits. Weber and Weiss (J Appl Probab 27(3):637–648, 1990) showed that if the bandit is indexable and the associated deterministic system has a fixed point that is a global attractor, then the Whittle index policy is asymptotically optimal in the regime where the arm population grows proportionally with the number of activated arms. In this paper, we show that, under the same conditions, this convergence is exponentially fast in the arm population, unless the fixed point is singular (to be defined later), which almost never happens in practice. Our result holds for the continuous-time model of Weber and Weiss (1990) and for a discrete-time model in which all bandits make synchronous transitions. Our proof is based on the nature of the deterministic equation governing the stochastic system: we show that it is a piecewise affine continuous dynamical system inside the simplex of the empirical measure of the arms. Using simulations and numerical solvers, we also investigate the singular cases, as well as how the level of singularity influences the (exponential) convergence rate. We illustrate our theorem on a Markovian fading channel model.
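
The piecewise affine structure mentioned in the abstract can be illustrated with a minimal sketch of the discrete-time deterministic map under a fixed priority policy. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function name `priority_step`, the 3-state transition matrices `P0` (passive) and `P1` (active), the activation fraction `alpha`, and the priority order `order`, which stands in for a ranking by Whittle index that the sketch does not compute.

```python
import numpy as np

def priority_step(m, P0, P1, alpha, order):
    """One step of the deterministic map under a strict priority policy.

    m     : distribution of arms over states (a point of the simplex)
    P0/P1 : row-stochastic transition matrices for passive/active arms
    alpha : fraction of arms activated at each step
    order : states listed from highest to lowest priority (assumed given)
    """
    m = np.asarray(m, dtype=float)
    active = np.zeros_like(m)
    budget = alpha
    # Activate states in priority order until the budget alpha is used up.
    for s in order:
        active[s] = min(m[s], budget)
        budget -= active[s]
    # Activated mass moves with P1, the remaining mass with P0.
    return active @ P1 + (m - active) @ P0

# Hypothetical 3-state example (values chosen only for illustration).
P0 = np.array([[0.9, 0.1, 0.0],
               [0.2, 0.7, 0.1],
               [0.0, 0.3, 0.7]])
P1 = np.array([[0.5, 0.5, 0.0],
               [0.1, 0.4, 0.5],
               [0.3, 0.0, 0.7]])
alpha = 0.4
order = [2, 1, 0]

# Iterate the map from the uniform distribution.
m = np.full(3, 1.0 / 3.0)
for _ in range(200):
    m = priority_step(m, P0, P1, alpha, order)

# Two points in the same region (state 1 is the partially activated state
# for both): the map is affine there, so it commutes with averaging.
m1 = np.array([0.5, 0.3, 0.2])
m2 = np.array([0.3, 0.4, 0.3])
mid_image = priority_step((m1 + m2) / 2, P0, P1, alpha, order)
image_mid = (priority_step(m1, P0, P1, alpha, order)
             + priority_step(m2, P0, P1, alpha, order)) / 2
```

The `min` in the activation loop is what makes the map piecewise affine: inside a region where the same state is the partially activated one, the map is affine in `m`, and it stays continuous across region boundaries. The sketch makes no claim about whether these particular matrices yield a globally attracting fixed point.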

Keywords

MSC2020 subject classifications: Primary 90C40; secondary 37H12, 60F10, 68M20. Multi-armed bandits; Whittle index; Asymptotic optimality

Domains

Probability [math.PR]; Performance and reliability [cs.PF]; Optimization and control [math.OC]

Main file

optimality_whittle.pdf (1.03 MB)

Origin: Files produced by the author(s)

Contributor: Nicolas Gast

https://inria.hal.science/hal-03041176

Submitted on: Wednesday, 19 July 2023, 17:36:16

Last modified on: Thursday, 4 April 2024, 21:08:00

Dates and versions

hal-03041176, version 1 (11 December 2020)

hal-03041176, version 2 (19 July 2023)

License

Attribution

Identifiers

HAL Id: hal-03041176, version 2
arXiv: 2012.09064
DOI: 10.1007/s11134-023-09875-x

Cite

Nicolas Gast, Bruno Gaujal, Chen Yan. Exponential Asymptotic Optimality of Whittle Index Policy. Queueing Systems, 2023, 104, pp.1-44. ⟨10.1007/s11134-023-09875-x⟩. ⟨hal-03041176v2⟩

Collections

UGA CNRS INRIA LIG LIG_SRCPR INRIA2 TDS-MACS LIG-SRCPR-POLARIS ANR LIG_SIDCH

109 views

182 downloads
