Aggregation of Multi-Armed Bandits Learning Algorithms for Opportunistic Spectrum Access

Multi-Armed Bandit (MAB) algorithms have recently been studied and evaluated for Cognitive Radio (CR), especially in the context of Opportunistic Spectrum Access (OSA). Several solutions have been explored, based on various models, but it is hard to predict exactly which one will perform best under real-world conditions at any given instant. Expert aggregation algorithms can therefore be useful to select, on the fly, the best algorithm for a specific situation. Aggregation algorithms such as Exp4, which dates back to 2002, have never been used for OSA learning, and we show that Exp4 appears empirically inefficient when applied to simple stochastic problems. In this article, we present an improved variant, called Aggregator. For synthetic OSA problems modeled as MAB problems, we present simulation results that demonstrate its empirical efficiency. We combine classical algorithms, such as Thompson sampling, Upper-Confidence Bound algorithms (UCB and variants), and Bayesian or Kullback-Leibler UCB. Our algorithm performs well compared to state-of-the-art algorithms (Exp4, CORRAL or LearnExp), and appears to be a robust approach for selecting, on the fly, the best algorithm for any stochastic MAB problem, making it better suited to real-world radio settings than any tuning-based approach.
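As a rough illustration of what expert aggregation means here, the sketch below combines two classical policies (UCB1 and Thompson sampling) through generic exponential-weights trust updates on Bernoulli arms, where each arm stands for a channel that is free with some probability. It is not the paper's Aggregator, nor Exp4: the class names, the learning rate `eta=0.1`, the trust update rule and the channel model are all assumptions chosen for illustration.

```python
import math
import random


class UCB1:
    """Classical UCB1 index policy for rewards in [0, 1]."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.sums = [0.0] * n_arms
        self.t = 0

    def choose(self):
        # Play every arm once before using the index.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        index = [
            self.sums[a] / self.counts[a]
            + math.sqrt(2.0 * math.log(self.t) / self.counts[a])
            for a in range(len(self.counts))
        ]
        return max(range(len(index)), key=index.__getitem__)

    def update(self, arm, reward):
        self.t += 1
        self.counts[arm] += 1
        self.sums[arm] += reward


class ThompsonSampling:
    """Thompson sampling with Beta(1, 1) priors for Bernoulli rewards."""

    def __init__(self, n_arms, rng):
        self.successes = [1] * n_arms
        self.failures = [1] * n_arms
        self.rng = rng

    def choose(self):
        samples = [
            self.rng.betavariate(s, f)
            for s, f in zip(self.successes, self.failures)
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        if reward > 0.5:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1


class ExpWeightsAggregator:
    """Hypothetical aggregator: follows one policy drawn from a trust vector."""

    def __init__(self, policies, rng, eta=0.1):
        self.policies = policies
        self.rng = rng
        self.eta = eta  # assumed learning rate, not taken from the paper
        self.trust = [1.0 / len(policies)] * len(policies)
        self.advice = []

    def choose(self):
        # Ask every policy for its arm, then follow one drawn from the trust.
        self.advice = [p.choose() for p in self.policies]
        k = self.rng.choices(range(len(self.policies)), weights=self.trust)[0]
        return self.advice[k]

    def update(self, arm, reward):
        for k, policy in enumerate(self.policies):
            # Reward the policies that recommended the arm actually played.
            if self.advice[k] == arm:
                self.trust[k] *= math.exp(self.eta * reward)
            # Every policy observes the played arm and its reward.
            policy.update(arm, reward)
        total = sum(self.trust)
        self.trust = [w / total for w in self.trust]


def simulate(means, horizon, seed=0):
    """Run the aggregator on Bernoulli arms; return (total reward, trust)."""
    rng = random.Random(seed)
    n_arms = len(means)
    policies = [UCB1(n_arms), ThompsonSampling(n_arms, rng)]
    aggregator = ExpWeightsAggregator(policies, rng)
    total = 0.0
    for _ in range(horizon):
        arm = aggregator.choose()
        reward = 1.0 if rng.random() < means[arm] else 0.0
        aggregator.update(arm, reward)
        total += reward
    return total, aggregator.trust


if __name__ == "__main__":
    total, trust = simulate([0.1, 0.5, 0.9], horizon=2000)
    print("total reward:", total, "final trust:", trust)
```

In this sketch the trust vector is renormalized after every step, so it always remains a probability distribution over the base policies; a policy that keeps recommending well-rewarded arms gradually receives more of the draws.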

Keywords

Aggregation algorithm; Multi-Armed Bandits; Reinforcement Learning; Learning theory; Aggregation of estimators; Cognitive radio

Domains

Networks and telecommunications [cs.NI]; Machine Learning [stat.ML]

Main file

IEEE_WCNC__2018__Paper__Lilian_Besson__07-17.pdf (286.85 KB)

Origin: Files produced by the author(s)

Contributor: Lilian Besson

https://inria.hal.science/hal-01705292

Submitted on: Friday, February 9, 2018, 11:51:18

Last modified on: Wednesday, January 24, 2024, 09:54:23

Long-term archiving on: Friday, May 4, 2018, 03:45:44

Dates and versions

hal-01705292, version 1 (09-02-2018)

License

Attribution - NonCommercial - ShareAlike

Identifiers

HAL Id: hal-01705292, version 1
DOI: 10.1109/wcnc.2018.8377070

Cite

Lilian Besson, Emilie Kaufmann, Christophe Moy. Aggregation of Multi-Armed Bandits Learning Algorithms for Opportunistic Spectrum Access. IEEE WCNC - IEEE Wireless Communications and Networking Conference, Apr 2018, Barcelona, Spain. ⟨10.1109/wcnc.2018.8377070⟩. ⟨hal-01705292⟩

Collections

UNIV-NANTES UNIV-RENNES1 CNRS INRIA INSA-RENNES IETR SUP_SCEE SUP_IETR IETR_SCEE CENTRALESUPELEC CRISTAL INRIA2 CRISTAL-SEQUEL UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UNIV-LILLE INSA-GROUPE ANR UR1-MATH-NUM IETR-ASIC NANTES-UNIVERSITE

761 Views

694 Downloads