Generative Adversarial Networks for Multimodal Representation Learning in Video Hyperlinking

Vedran Vukotic 1, Christian Raymond 1, Guillaume Gravier 1
1 LinkMedia - Creating and exploiting explicit links between multimedia fragments
Inria Rennes – Bretagne Atlantique, IRISA_D6 - MEDIA ET INTERACTIONS
Abstract: Continuous multimodal representations suitable for multimodal information retrieval are usually obtained with methods that rely heavily on multimodal autoencoders. In video hyperlinking, a task that aims at retrieving video segments, the state of the art is a variation of two interlocked networks working in opposing directions. These systems provide good multimodal embeddings and are also capable of translating from one representation space to the other. Operating on representation spaces, these networks lack the ability to operate in the original spaces (text or image), which makes it difficult to visualize the crossmodal function, and do not generalize well to unseen data. Recently, generative adversarial networks (GANs) have gained popularity and have been used for generating realistic synthetic data and for obtaining high-level, single-modal latent representation spaces. In this work, we evaluate the feasibility of using GANs to obtain multimodal representations. We show that GANs can be used for multimodal representation learning and that they provide multimodal representations that are superior to representations obtained with multimodal autoencoders. Additionally, we illustrate the ability to visualize crossmodal translations, which can provide human-interpretable insights into learned GAN-based video hyperlinking models.
Document type:
Conference paper
ACM International Conference on Multimedia Retrieval (ICMR) 2017, Jun 2017, Bucharest, Romania. 2017, Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval. <www.icmr2017.ro>. <10.1145/3078971.3079038>

https://hal.inria.fr/hal-01522419
Contributor: Vedran Vukotić
Submitted on: Monday, May 15, 2017 - 09:34:39
Last modified on: Wednesday, August 2, 2017 - 10:10:34

File

Vukotic_ICMR_2017.pdf
Files produced by the author(s)

Citation

Vedran Vukotic, Christian Raymond, Guillaume Gravier. Generative Adversarial Networks for Multimodal Representation Learning in Video Hyperlinking. ACM International Conference on Multimedia Retrieval (ICMR) 2017, Jun 2017, Bucharest, Romania. 2017, Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval. <www.icmr2017.ro>. <10.1145/3078971.3079038>. <hal-01522419>
