Dynamic Multi-Armed Bandits and Extreme Value-based Rewards for Adaptive Operator Selection in Evolutionary Algorithms

Álvaro Fialho 1, * Luis Da Costa 2, 3 Marc Schoenauer 1, 2, 3 Michèle Sebag 1, 2, 3
* Corresponding author
2 TAO - Machine Learning and Optimisation
LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France, CNRS - Centre National de la Recherche Scientifique : UMR8623
Abstract: The performance of many efficient algorithms critically depends on the tuning of their parameters, which in turn depends on the problem at hand. For example, the performance of Evolutionary Algorithms critically depends on the judicious setting of the operator rates. The Adaptive Operator Selection (AOS) heuristic proposed here rewards each operator based on the extreme value of the fitness improvement recently achieved by this operator, and uses a Multi-Armed Bandit (MAB) selection process based on those rewards to choose which operator to apply next. This Extreme-based Multi-Armed Bandit approach is experimentally validated against the Average-based MAB method, and is shown to outperform previously published methods, whether they use a classical Average-based rewarding technique or the same Extreme-based mechanism. The validation test suite includes the easy One-Max problem and a family of hard problems known as "Long k-paths".
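The mechanism summarized in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: the class name `ExtremeValueBandit`, the sliding-window size `window`, the exploration weight `scale`, and the use of a UCB1-style score for the bandit step. It is not presented as the authors' implementation, and it leaves out the dynamic (restart) component named in the title.

```python
import math
from collections import deque


class ExtremeValueBandit:
    """Illustrative sketch: extreme-value rewards feeding a UCB1-style selector."""

    def __init__(self, n_operators, window=50, scale=1.0):
        # `window` and `scale` are hypothetical hyper-parameters, not values from the paper.
        self.windows = [deque(maxlen=window) for _ in range(n_operators)]
        self.counts = [0] * n_operators   # how many times each operator was applied
        self.means = [0.0] * n_operators  # running mean of the rewards it received
        self.scale = scale

    def select(self):
        """Return the index of the operator to apply next."""
        # Try every operator once before relying on the statistics.
        for op, n in enumerate(self.counts):
            if n == 0:
                return op
        total = sum(self.counts)

        def score(op):
            # Empirical mean plus an exploration bonus that shrinks with use.
            bonus = math.sqrt(2.0 * math.log(total) / self.counts[op])
            return self.means[op] + self.scale * bonus

        return max(range(len(self.counts)), key=score)

    def update(self, op, fitness_improvement):
        """Record an improvement for `op` and return the reward credited to it."""
        # Only improvements count; a worsening is stored as zero.
        self.windows[op].append(max(0.0, fitness_improvement))
        # Extreme-value reward: the best improvement in the recent window,
        # where an average-based scheme would take the window mean instead.
        reward = max(self.windows[op])
        self.counts[op] += 1
        self.means[op] += (reward - self.means[op]) / self.counts[op]
        return reward
```

In an evolutionary loop, one would call `select()` to pick an operator, apply it, measure the fitness gain of the offspring over its parent, and pass that gain to `update()`. Taking the window maximum means an operator that rarely yields a large jump keeps a high reward until that jump leaves the window.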
Document type:
Conference paper
Learning and Intelligent Optimization (LION 3), Jan 2009, Trento, Italy. 5851/2009, pp.176-190, 2009, Lecture Notes in Computer Science. 〈10.1007/978-3-642-11169-3_13〉

https://hal.inria.fr/inria-00377401
Contributor: Álvaro Fialho
Submitted on: Tuesday, June 23, 2009 - 01:19:22
Last modified on: Thursday, November 30, 2017 - 01:20:57
Document(s) archived on: Wednesday, September 22, 2010 - 12:35:09

Citation

Álvaro Fialho, Luis Da Costa, Marc Schoenauer, Michèle Sebag. Dynamic Multi-Armed Bandits and Extreme Value-based Rewards for Adaptive Operator Selection in Evolutionary Algorithms. Learning and Intelligent Optimization (LION 3), Jan 2009, Trento, Italy. 5851/2009, pp.176-190, 2009, Lecture Notes in Computer Science. 〈10.1007/978-3-642-11169-3_13〉. 〈inria-00377401v2〉
