Joint Embeddings of Scene Graphs and Images

Abstract: Multimodal representations of text and images have become popular in recent years. Text, however, has inherent ambiguities when describing visual scenes, which has led to the recent development of datasets with detailed graphical descriptions in the form of scene graphs. We consider the task of jointly representing semantically precise scene graphs and images. We propose models for representing scene graphs and aligning them with images, investigating methods based on bag-of-words and subpath representations as well as neural networks. We contrast several models that can address this task and highlight unique challenges in both model design and evaluation.
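To make the abstract's setup concrete, below is a minimal sketch of one of the simplest variants it mentions: embedding a bag-of-words representation of a scene graph (counts over object/attribute/relationship tokens) and a precomputed image feature vector into a shared space, trained with a bidirectional max-margin ranking loss, as is standard in joint image-text embedding work. This is an illustrative assumption, not the authors' actual model; all names, dimensions, and the loss formulation are hypothetical.

```python
# Minimal sketch (not the paper's code): a joint embedding of a scene-graph
# bag-of-words vector and an image feature vector, aligned with a
# bidirectional hinge ranking loss over in-batch negatives.
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointEmbedding(nn.Module):
    def __init__(self, vocab_size, img_feat_dim, embed_dim=300):
        super().__init__()
        # Linear map from scene-graph token counts to the shared space.
        self.graph_proj = nn.Linear(vocab_size, embed_dim)
        # Linear map from precomputed image features (e.g. CNN activations).
        self.img_proj = nn.Linear(img_feat_dim, embed_dim)

    def forward(self, graph_bow, img_feat):
        # L2-normalize so the dot product below is cosine similarity.
        g = F.normalize(self.graph_proj(graph_bow), dim=-1)
        v = F.normalize(self.img_proj(img_feat), dim=-1)
        return g, v

def ranking_loss(g, v, margin=0.2):
    """Bidirectional max-margin loss; matched pairs sit on the diagonal."""
    scores = g @ v.t()                                 # similarity matrix
    pos = scores.diag().unsqueeze(1)                   # positive pair scores
    cost_g = (margin + scores - pos).clamp(min=0)      # graph -> image
    cost_v = (margin + scores - pos.t()).clamp(min=0)  # image -> graph
    mask = torch.eye(scores.size(0), dtype=torch.bool)
    cost_g = cost_g.masked_fill(mask, 0)               # ignore the positives
    cost_v = cost_v.masked_fill(mask, 0)
    return cost_g.mean() + cost_v.mean()

# Toy usage with random tensors standing in for real features.
model = JointEmbedding(vocab_size=1000, img_feat_dim=2048)
graph_bow = torch.rand(8, 1000)
img_feat = torch.randn(8, 2048)
g, v = model(graph_bow, img_feat)
loss = ranking_loss(g, v)
loss.backward()
```

Under these assumptions, retrieval in either direction (image from scene graph, or scene graph from image) reduces to nearest-neighbor search in the shared space; the subpath and neural-network variants mentioned in the abstract would replace the bag-of-words encoder while keeping the same alignment objective.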
Document type:
Poster
International Conference on Learning Representations (ICLR) Workshop, 2017, Toulon, France


https://hal.inria.fr/hal-01667777
Contributor: Eugene Belilovsky
Submitted on: Tuesday, December 19, 2017 - 15:42:13
Last modified on: Thursday, February 7, 2019 - 17:29:14

Identifiers

  • HAL Id : hal-01667777, version 1

Citation

Eugene Belilovsky, Matthew Blaschko, Jamie Kiros, Raquel Urtasun, Richard Zemel. Joint Embeddings of Scene Graphs and Images. International Conference on Learning Representations (ICLR) Workshop, 2017, Toulon, France. 〈hal-01667777〉


Metrics

Record views: 66
File downloads: 38