Preference-based reinforcement learning: evolutionary direct policy search using a preference-based racing algorithm

Róbert Busa-Fekete 1, 2 Balázs Szörényi 3, 1 Paul Weng 4 Weiwei Cheng 2 Eyke Hüllermeier 2
3 SEQUEL - Sequential Learning
LIFL - Laboratoire d'Informatique Fondamentale de Lille, Inria Lille - Nord Europe, LAGIS - Laboratoire d'Automatique, Génie Informatique et Signal
4 DECISION
LIP6 - Laboratoire d'Informatique de Paris 6
Abstract: We introduce a novel approach to preference-based reinforcement learning, namely a preference-based variant of a direct policy search method based on evolutionary optimization. The core of our approach is a preference-based racing algorithm that selects the best among a given set of candidate policies with high probability. To this end, the algorithm operates on a suitable ordinal preference structure and only uses pairwise comparisons between sample rollouts of the policies. Embedding the racing algorithm in a rank-based evolutionary search procedure, we show that approximations of the so-called Smith set of optimal policies can be produced with certain theoretical guarantees. Apart from a formal performance and complexity analysis, we present first experimental studies showing that our approach performs well in practice.
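
The racing idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a simplified Hoeffding-style race, written under the assumption that each policy can be rolled out on demand and that a pairwise comparator returns 1.0, 0.5 or 0.0 for "preferred", "tie" and "not preferred". The names `preference_race`, `gaussian_rollout` and `numeric_prefers` are hypothetical, and the sketch covers neither the evolutionary search nor the Smith-set guarantees.

```python
import math
import random


def preference_race(policies, rollout, prefers, delta=0.05, max_rounds=2000, rng=None):
    """Simplified preference-based race (illustrative, not the paper's procedure).

    Returns the sorted indices of the policies still active when the race stops.
    """
    rng = rng or random.Random(0)
    k = len(policies)
    active = set(range(k))
    # wins[i][j] accumulates how often a rollout of i was preferred to one of j.
    wins = [[0.0] * k for _ in range(k)]
    n_pairs = max(1, k * (k - 1) // 2)
    for n in range(1, max_rounds + 1):
        # Draw one fresh rollout per active policy and compare all active pairs.
        samples = {i: rollout(policies[i], rng) for i in active}
        for i in active:
            for j in active:
                if i != j:
                    wins[i][j] += prefers(samples[i], samples[j])
        # Hoeffding radius with a crude union bound over pairs and rounds.
        radius = math.sqrt(math.log(2 * n_pairs * max_rounds / delta) / (2 * n))
        # A policy is beaten if some active rival wins against it with
        # estimated probability confidently above 1/2.
        beaten = {
            i
            for i in active
            for j in active
            if i != j and wins[j][i] / n - radius > 0.5
        }
        if beaten == active:
            # Preference cycle: every active policy is beaten, so stop here.
            break
        active -= beaten
        if len(active) <= 1:
            break
    return sorted(active)


def gaussian_rollout(mean, rng):
    """Toy rollout (assumption): a policy is just the mean of a noisy return."""
    return rng.gauss(mean, 1.0)


def numeric_prefers(a, b):
    """Toy comparator (assumption): the larger outcome is preferred, ties split."""
    if a > b:
        return 1.0
    if a < b:
        return 0.0
    return 0.5
```

With the toy helpers, `preference_race([0.0, 0.5, 2.0], gaussian_rollout, numeric_prefers)` races three noisy "policies" using only pairwise comparisons of their rollouts. Numeric returns are used here purely for convenience; the comparator could just as well rank non-numeric trajectories.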
Document type:
Journal article
Machine Learning, Springer Verlag, 2014, 97 (3), pp.327-351. 〈10.1007/s10994-014-5458-8〉

https://hal.inria.fr/hal-01079370
Contributor: Balazs Szorenyi
Submitted on: Saturday, November 1, 2014 - 12:27:47
Last modified on: Saturday, April 7, 2018 - 14:30:02
Document(s) archived on: Monday, February 2, 2015 - 16:52:18

File

revised_1_1.pdf
Files produced by the author(s)

Citation

Róbert Busa-Fekete, Balázs Szörényi, Paul Weng, Weiwei Cheng, Eyke Hüllermeier. Preference-based reinforcement learning: evolutionary direct policy search using a preference-based racing algorithm. Machine Learning, Springer Verlag, 2014, 97 (3), pp.327-351. 〈10.1007/s10994-014-5458-8〉. 〈hal-01079370〉
