End-to-end optimization of goal-driven and visually grounded dialogue systems

Conference paper, 2017


Abstract

End-to-end design of dialogue systems has recently become a popular research topic thanks to powerful tools such as encoder-decoder architectures for sequence-to-sequence learning. Yet most current approaches cast human-machine dialogue management as a supervised learning problem, aiming to predict the next utterance of a participant given the full history of the dialogue. This view is too simplistic to capture the planning problem inherent to dialogue as well as its grounded nature, which makes the context of a dialogue larger than its history alone. This is why only chitchat and question-answering tasks have been addressed so far using end-to-end architectures. In this paper, we introduce a Deep Reinforcement Learning method, based on the policy gradient algorithm, to optimize visually grounded task-oriented dialogues. This approach is tested on a dataset of 120k dialogues collected through Mechanical Turk and provides encouraging results on both generating natural dialogues and discovering a specific object in a complex picture.
Main file
1703.05423.pdf (3.44 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-01549642, version 1 (28-06-2017)

Licence

Attribution - CC BY 4.0

Cite

Florian Strub, Harm de Vries, Jeremie Mary, Bilal Piot, Aaron Courville, et al. End-to-end optimization of goal-driven and visually grounded dialogue systems. International Joint Conference on Artificial Intelligence, Aug 2017, Melbourne, Australia. ⟨hal-01549642⟩