Preference-based reinforcement learning: evolutionary direct policy search using a preference-based racing algorithm

Róbert Busa-Fekete (1, 2), Balázs Szörényi (3, 1), Paul Weng (4), Weiwei Cheng (2), Eyke Hüllermeier (2)
3 SEQUEL - Sequential Learning
LIFL - Laboratoire d'Informatique Fondamentale de Lille, Inria Lille - Nord Europe, LAGIS - Laboratoire d'Automatique, Génie Informatique et Signal
4 DECISION
LIP6 - Laboratoire d'Informatique de Paris 6
Abstract : We introduce a novel approach to preference-based reinforcement learning, namely a preference-based variant of a direct policy search method based on evolutionary optimization. The core of our approach is a preference-based racing algorithm that selects the best among a given set of candidate policies with high probability. To this end, the algorithm operates on a suitable ordinal preference structure and only uses pairwise comparisons between sample rollouts of the policies. Embedding the racing algorithm in a rank-based evolutionary search procedure, we show that approximations of the so-called Smith set of optimal policies can be produced with certain theoretical guarantees. Apart from a formal performance and complexity analysis, we present first experimental studies showing that our approach performs well in practice.
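As an illustration of the racing idea described in the abstract, the following is a minimal sketch of a Hoeffding-style race that uses only pairwise comparisons between sample rollouts. It is not the algorithm from the paper: the function names (`preference_race`, `prefer_higher`), the elimination rule (drop a policy once some other surviving policy is confidently preferred to it), the union-bound confidence radius, and the use of plain numbers as rollouts are all assumptions made for this example.

```python
import math
import random
from itertools import combinations


def prefer_higher(a, b):
    """Ordinal comparison of two rollouts: 1.0 if a wins, 0.0 if b wins, 0.5 on a tie."""
    if a > b:
        return 1.0
    if a < b:
        return 0.0
    return 0.5


def preference_race(policies, prefer=prefer_higher, delta=0.05, max_rounds=2000, seed=0):
    """Return the indices of the policies that survive the race (hypothetical sketch).

    policies: list of callables taking a random.Random and returning one rollout.
    prefer:   pairwise comparison of two rollouts, returning 1.0, 0.5, or 0.0.
    """
    rng = random.Random(seed)
    alive = set(range(len(policies)))
    pairs = list(combinations(range(len(policies)), 2))
    wins = {pair: 0.0 for pair in pairs}  # accumulated preference of i over j, for i < j

    for t in range(1, max_rounds + 1):
        if len(alive) <= 1:
            break
        # One fresh rollout per surviving policy in every round.
        rollouts = {i: policies[i](rng) for i in alive}
        # Hoeffding radius with a union bound over all pairs and rounds (an assumption).
        # Every surviving pair has been compared exactly t times so far.
        radius = math.sqrt(math.log(2 * len(pairs) * max_rounds / delta) / (2 * t))
        beaten = set()
        for i, j in combinations(sorted(alive), 2):
            wins[(i, j)] += prefer(rollouts[i], rollouts[j])
            p_hat = wins[(i, j)] / t  # estimated probability that i is preferred to j
            if p_hat - radius > 0.5:
                beaten.add(j)
            elif p_hat + radius < 0.5:
                beaten.add(i)
        if beaten == alive:
            # Confident preference cycle: no single winner, so stop and return the cycle.
            break
        alive -= beaten
    return sorted(alive)


# Usage: three hypothetical policies whose rollouts are noisy scalar outcomes.
candidates = [lambda rng, m=m: rng.gauss(m, 1.0) for m in (0.0, 0.5, 2.0)]
print(preference_race(candidates))
```

Because the race only ever calls `prefer` on two rollouts, the rollouts need not be numeric; any object with a pairwise preference relation would work in this sketch.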
Document type :
Journal articles

Cited literature: 29 references

https://hal.inria.fr/hal-01079370
Contributor : Balazs Szorenyi
Submitted on : Saturday, November 1, 2014 - 12:27:47 PM
Last modification on : Tuesday, November 26, 2019 - 4:12:08 PM
Long-term archiving on: Monday, February 2, 2015 - 4:52:18 PM

File

revised_1_1.pdf
Files produced by the author(s)

Citation

Róbert Busa-Fekete, Balázs Szörényi, Paul Weng, Weiwei Cheng, Eyke Hüllermeier. Preference-based reinforcement learning: evolutionary direct policy search using a preference-based racing algorithm. Machine Learning, Springer Verlag, 2014, 97 (3), pp.327-351. ⟨10.1007/s10994-014-5458-8⟩. ⟨hal-01079370⟩
