Task Completion Transfer Learning for Reward Inference

Layla El Asri; Romain Laroche; Olivier Pietquin

Conference paper, Year: 2014

Layla El Asri

Role: Author
PersonId : 932394

Georgia Tech Lorraine [Metz]

Orange Labs [Issy les Moulineaux]

Romain Laroche

Role: Author

Orange Labs [Issy les Moulineaux]

Olivier Pietquin

Role: Author
PersonId : 4024
IdHAL : olivier-pietquin
ORCID : 0000-0002-5386-465X
IdRef : 142821861

Laboratoire d'Informatique Fondamentale de Lille

Abstract

Reinforcement learning-based spoken dialogue systems aim to compute an optimal dialogue management strategy from interactions with users. They compare candidate strategies on the basis of a numerical reward function. Reward inference consists of learning such a reward function from dialogues scored by users. A major issue for reward inference algorithms is that important parameters that influence user evaluations cannot be computed online; task completion is one of them. This paper introduces Task Completion Transfer Learning (TCTL), a method that exploits exact knowledge of task completion on a corpus of user-scored dialogues in order to optimise online learning. Compared to previously proposed reward inference techniques, TCTL returns a reward function that can also handle the online non-observability of task completion. A reward function is learnt with TCTL on dialogues with a restaurant-seeking system. The reward function returned by TCTL is shown to be a better estimator of dialogue performance than the one returned by reward inference.
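The role that task completion plays in estimating user scores can be illustrated with a toy sketch. This is not the TCTL algorithm, whose details are not given in the abstract; it only compares two simple score estimators on a hypothetical corpus, one that ignores task completion and one that uses the exact label. The corpus values and the function names (`fit_global_mean`, `fit_per_outcome_mean`) are assumptions made up for illustration.

```python
from statistics import mean

# Hypothetical toy corpus of (task_completed, user_score) pairs.
# These numbers are invented for illustration and are not from the paper.
CORPUS = [
    (True, 5.0), (True, 4.0), (True, 4.5),
    (False, 2.0), (False, 1.0), (False, 1.5),
]


def mse(predictions, targets):
    """Mean squared error between predicted and actual user scores."""
    return mean((p - t) ** 2 for p, t in zip(predictions, targets))


def fit_global_mean(corpus):
    """Score estimator that ignores task completion entirely."""
    overall = mean(score for _, score in corpus)
    return lambda completed: overall


def fit_per_outcome_mean(corpus):
    """Score estimator that uses the exact task-completion label."""
    by_outcome = {
        outcome: mean(s for c, s in corpus if c == outcome)
        for outcome in (True, False)
    }
    return lambda completed: by_outcome[completed]


scores = [s for _, s in CORPUS]
blind = fit_global_mean(CORPUS)
informed = fit_per_outcome_mean(CORPUS)

blind_error = mse([blind(c) for c, _ in CORPUS], scores)
informed_error = mse([informed(c) for c, _ in CORPUS], scores)

print(f"error without task completion: {blind_error:.3f}")
print(f"error with task completion:    {informed_error:.3f}")
```

On this toy data the estimator that knows whether the task was completed fits the user scores more closely. The difficulty described in the abstract is that this label is available in the scored corpus but is not observable online.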

Domains

Computer Science [cs]; Engineering Sciences [physics]

https://inria.hal.science/hal-01107500

Submitted on: Tuesday, 20 January 2015, 18:32:11

Last modified on: Thursday, 13 April 2023, 09:26:12

Dates and versions

hal-01107500, version 1 (20-01-2015)

Identifiers

HAL Id: hal-01107500, version 1

Cite

Layla El Asri, Romain Laroche, Olivier Pietquin. Task Completion Transfer Learning for Reward Inference. International Workshop on Machine Learning for Interactive Systems (MLIS 2014), Jul 2014, Québec, Canada. ⟨hal-01107500⟩

Collections

SUPELEC UNIV-LILLE3 CNRS INRIA UNIV-FCOMTE UMI-GTL