Conference papers

End-to-end optimization of goal-driven and visually grounded dialogue systems

Abstract: End-to-end design of dialogue systems has recently become a popular research topic thanks to powerful tools such as encoder-decoder architectures for sequence-to-sequence learning. Yet, most current approaches cast human-machine dialogue management as a supervised learning problem, aiming at predicting the next utterance of a participant given the full history of the dialogue. This vision is too simplistic to render the intrinsic planning problem inherent to dialogue as well as its grounded nature, making the context of a dialogue larger than the sole history. This is why only chitchat and question answering tasks have been addressed so far using end-to-end architectures. In this paper, we introduce a Deep Reinforcement Learning method to optimize visually grounded task-oriented dialogues, based on the policy gradient algorithm. This approach is tested on a dataset of 120k dialogues collected through Mechanical Turk and provides encouraging results at solving both the problem of generating natural dialogues and the task of discovering a specific object in a complex picture.
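
To illustrate the policy gradient idea named in the abstract, here is a minimal REINFORCE sketch on a toy problem. It is not the paper's method or model: the single-step game, the function names (`softmax`, `reinforce`), the reward scheme, and all hyperparameters are assumptions chosen only to show how a softmax policy can be improved from a task-success reward rather than from supervised next-utterance targets.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce(num_actions=4, target=2, episodes=2000, lr=0.1, seed=0):
    """Train a softmax policy with the REINFORCE (policy gradient) update.

    Hypothetical toy stand-in for a goal-driven game: the agent picks one
    action per episode and receives reward 1.0 only when it picks `target`.
    """
    rng = random.Random(seed)
    logits = [0.0] * num_actions  # policy parameters
    baseline = 0.0                # running mean reward, used to reduce variance
    for _ in range(episodes):
        probs = softmax(logits)
        action = rng.choices(range(num_actions), weights=probs)[0]
        reward = 1.0 if action == target else 0.0
        advantage = reward - baseline
        # For a softmax policy, d log pi(action) / d logit_k
        # equals 1[k == action] - probs[k].
        for k in range(num_actions):
            grad_log_pi = (1.0 if k == action else 0.0) - probs[k]
            logits[k] += lr * advantage * grad_log_pi
        baseline += 0.05 * (reward - baseline)
    return softmax(logits)


if __name__ == "__main__":
    final_probs = reinforce()
    print([round(p, 3) for p in final_probs])
```

In this sketch the learning signal is the episode reward alone, so the policy concentrates probability on the rewarded action without ever being shown a target label. A dialogue setting would replace the single action with a sequence of generated tokens and the reward with task success at the end of the dialogue.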

Contributor: Florian Strub
Submitted on: Wednesday, June 28, 2017 - 11:44:06 PM
Last modification on: Friday, January 21, 2022 - 3:13:05 AM
Long-term archiving on: Thursday, January 18, 2018 - 2:40:26 AM


Files produced by the author(s)


Distributed under a Creative Commons Attribution 4.0 International License


  • HAL Id: hal-01549642, version 1
  • arXiv: 1703.05423


Florian Strub, Harm de Vries, Jeremie Mary, Bilal Piot, Aaron Courville, et al. End-to-end optimization of goal-driven and visually grounded dialogue systems. International Joint Conference on Artificial Intelligence, Aug 2017, Melbourne, Australia. ⟨hal-01549642⟩


