Differential Evolution Algorithm Applied to Non-Stationary Bandit Problem

David L. St-Pierre 1, 2 Jialin Liu 1, 3
1 TAO - Machine Learning and Optimisation
LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France, CNRS - Centre National de la Recherche Scientifique : UMR8623
2 Montefiore institute
LRI - Laboratoire de Recherche en Informatique, Institut Montefiore - Department of Electrical Engineering and Computer Science
Abstract: In this paper we compare Differential Evolution (DE), an evolutionary algorithm, to classical bandit algorithms on the non-stationary bandit problem. First, we define a test case in which the variation of the distributions depends on the number of times an option is evaluated rather than on elapsed time. This definition makes it possible to apply these algorithms to a wide range of problems, such as black-box portfolio selection. Second, we present our own variant of the discounted Upper Confidence Bound (UCB) algorithm, which outperforms the current state-of-the-art algorithms for the non-stationary bandit problem. Third, we introduce a variant of DE and show that, when selecting among a portfolio of solvers for the Cart-Pole problem, our version of DE outperforms the current best UCB algorithms.
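To make the setting in the abstract concrete, here is a minimal sketch of a bandit whose reward distribution changes with the number of times an arm has been evaluated, played by a commonly used form of discounted UCB (discounted mean plus an exploration bonus). This is not the variant proposed in the paper, and it is not the paper's test case: the class name `DiscountedUCB`, the functions `pull` and `run`, the two-arm reward model, and the values of `gamma`, `xi` and `bound` are all assumptions chosen for illustration.

```python
import math
import random


class DiscountedUCB:
    """Generic discounted UCB sketch; NOT the variant proposed in the paper."""

    def __init__(self, n_arms, gamma=0.95, xi=0.6, bound=1.0):
        self.gamma = gamma            # discount factor in (0, 1), assumed value
        self.xi = xi                  # exploration constant, assumed value
        self.bound = bound            # assumed upper bound on rewards
        self.counts = [0.0] * n_arms  # discounted number of pulls per arm
        self.sums = [0.0] * n_arms    # discounted sum of rewards per arm

    def select(self):
        # Pull any arm that has no (discounted) observations yet.
        for arm, count in enumerate(self.counts):
            if count == 0.0:
                return arm
        total = sum(self.counts)

        def index(arm):
            mean = self.sums[arm] / self.counts[arm]
            log_term = max(0.0, math.log(total))
            bonus = 2.0 * self.bound * math.sqrt(
                self.xi * log_term / self.counts[arm]
            )
            return mean + bonus

        return max(range(len(self.counts)), key=index)

    def update(self, arm, reward):
        # Discount all past observations, then record the new one.
        self.counts = [self.gamma * c for c in self.counts]
        self.sums = [self.gamma * s for s in self.sums]
        self.counts[arm] += 1.0
        self.sums[arm] += reward


def pull(arm, times_pulled, rng):
    """Hypothetical arms: the mean depends on how often the arm was evaluated."""
    if arm == 0:
        mean = 0.5                                  # stationary arm
    else:
        mean = min(0.9, 0.2 + 0.01 * times_pulled)  # improves with evaluations
    return 1.0 if rng.random() < mean else 0.0      # Bernoulli reward


def run(horizon=2000, seed=0):
    rng = random.Random(seed)
    bandit = DiscountedUCB(n_arms=2)
    pulls = [0, 0]
    for _ in range(horizon):
        arm = bandit.select()
        bandit.update(arm, pull(arm, pulls[arm], rng))
        pulls[arm] += 1
    return pulls
```

Calling `run()` returns the number of times each arm was selected. Because the second arm's mean changes only when that arm is pulled, the non-stationarity is driven by evaluation counts rather than by wall-clock time, which is the distinction the abstract draws.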
Document type:
Conference paper
2014 IEEE Congress on Evolutionary Computation (IEEE CEC 2014), Jul 2014, Beijing, China. 2014
Cited literature: 24 references

https://hal.inria.fr/hal-00979456
Contributor: Jialin Liu
Submitted on: Wednesday, 16 July 2014 - 07:00:15
Last modified on: Thursday, 5 April 2018 - 12:30:24
Document(s) archived on: Thursday, 20 November 2014 - 15:37:44

File

main.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00979456, version 1

Citation

David L. St-Pierre, Jialin Liu. Differential Evolution Algorithm Applied to Non-Stationary Bandit Problem. 2014 IEEE Congress on Evolutionary Computation (IEEE CEC 2014), Jul 2014, Beijing, China. 2014. 〈hal-00979456〉

Metrics

Record views

544

File downloads

207