End-to-end optimization of goal-driven and visually grounded dialogue systems

Florian Strub 1, 2; Harm De Vries 3; Jeremie Mary 1, 2; Bilal Piot 4; Aaron Courville 3; Olivier Pietquin 4
1 SEQUEL - Sequential Learning
Inria Lille - Nord Europe, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
Abstract: End-to-end design of dialogue systems has recently become a popular research topic thanks to powerful tools such as encoder-decoder architectures for sequence-to-sequence learning. Yet most current approaches cast human-machine dialogue management as a supervised learning problem, aiming to predict the next utterance of a participant given the full history of the dialogue. This view is too simplistic to capture the planning problem inherent to dialogue as well as its grounded nature, which makes the context of a dialogue larger than its history alone. This is why only chitchat and question answering tasks have been addressed so far using end-to-end architectures. In this paper, we introduce a Deep Reinforcement Learning method, based on the policy gradient algorithm, to optimize visually grounded task-oriented dialogues. This approach is tested on a dataset of 120k dialogues collected through Mechanical Turk and provides encouraging results on both generating natural dialogues and discovering a specific object in a complex picture.
Document type:
Conference paper
International Joint Conference on Artificial Intelligence, Aug 2017, Melbourne, Australia. 〈https://ijcai-17.org/〉

https://hal.inria.fr/hal-01549642
Contributor: Florian Strub
Submitted on: Wednesday, June 28, 2017 - 23:44:06
Last modified on: Thursday, January 11, 2018 - 06:27:32

File

1703.05423.pdf
Files produced by the author(s)

License


Distributed under a Creative Commons Attribution 4.0 International License

Identifiers

  • HAL Id: hal-01549642, version 1
  • arXiv: 1703.05423

Citation

Florian Strub, Harm De Vries, Jeremie Mary, Bilal Piot, Aaron Courville, et al. End-to-end optimization of goal-driven and visually grounded dialogue systems. International Joint Conference on Artificial Intelligence, Aug 2017, Melbourne, Australia. 〈https://ijcai-17.org/〉. 〈hal-01549642〉

Metrics

Record views

213

File downloads

219