# Xi-Learning: Successor Feature Transfer Learning for General Reward Functions

Abstract : Transfer in Reinforcement Learning aims to improve learning performance on target tasks using knowledge from experienced source tasks. Successor features (SF) are a prominent transfer mechanism in domains where the reward function changes between tasks. They reevaluate the expected return of previously learned policies in a new target task in order to transfer their knowledge. A limiting factor of the SF framework is its assumption that rewards linearly decompose into successor features and a reward weight vector. We propose a novel SF mechanism, $\xi$-learning, based on learning the cumulative discounted probability of successor features. Crucially, $\xi$-learning makes it possible to reevaluate the expected return of policies for general reward functions. We introduce two $\xi$-learning variations, prove its convergence, and provide a guarantee on its transfer performance. Experimental evaluations based on $\xi$-learning with function approximation demonstrate the prominent advantage of $\xi$-learning over available mechanisms, not only for general reward functions but also for linearly decomposable reward functions.
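The contrast drawn in the abstract between a linear reward decomposition and a general reward function can be illustrated with a small tabular sketch. This is not the authors' implementation: the 3-state chain `P`, the scalar features `phi`, the quadratic `reward`, and the use of state values instead of state-action values are all assumptions made for illustration. It computes the cumulative discounted probability of each feature in closed form and compares the value recovered from it with the value recovered from a least-squares linear reward fit.

```python
import numpy as np

gamma = 0.9

# Hypothetical 3-state Markov chain induced by some fixed policy (rows sum to 1).
P = np.array([[0.1, 0.6, 0.3],
              [0.3, 0.3, 0.4],
              [0.5, 0.2, 0.3]])

# Assumption: visiting state j emits the scalar feature phi[j] (discrete feature set).
phi = np.array([0.0, 1.0, 2.0])


def reward(f):
    """A general, non-linear reward over features, chosen only for illustration."""
    return f ** 2


r = reward(phi)

# xi[s, j] = sum_k gamma^k * Pr(feature at step k equals phi[j] | start in state s).
# For a finite chain this discounted sum has the closed form (I - gamma * P)^-1.
xi = np.linalg.inv(np.eye(3) - gamma * P)

# Value from xi: weight each feature's discounted probability by its reward.
v_xi = xi @ r

# Classical SF view: psi is the expected discounted feature sum, and the
# reward is approximated linearly as w * phi (least-squares fit of w).
psi = xi @ phi
w = (phi @ r) / (phi @ phi)
v_sf = psi * w

# Reference value: truncated discounted sum of expected rewards.
v_ref = np.zeros(3)
step = np.eye(3)  # holds P^k
for k in range(500):
    v_ref += (gamma ** k) * (step @ r)
    step = step @ P

print("xi-based value :", v_xi)
print("linear SF value:", v_sf)
print("reference value:", v_ref)
```

In this toy setting the value computed from the discounted feature probabilities matches the reference, while the linear fit does not, because a quadratic reward over a scalar feature cannot be written as a weight times that feature.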
Document type :
Preprints, Working Papers, ...

https://hal.inria.fr/hal-03426870
Contributor : Xavier Alameda-Pineda
Submitted on : Friday, November 12, 2021 - 4:08:17 PM
Last modification on : Wednesday, May 4, 2022 - 12:00:02 PM

### Identifiers

• HAL Id : hal-03426870, version 1
• ARXIV : 2110.15701

### Citation

Chris Reinke, Xavier Alameda-Pineda. Xi-Learning: Successor Feature Transfer Learning for General Reward Functions. {date}. ⟨hal-03426870⟩
