Learning to detect visual relations

Julia Peyre
Abstract: In this thesis, we study the problem of detecting visual relations of the form (subject, predicate, object) in images; these relations are intermediate-level semantic units between individual objects and complex scenes. Our work addresses two main challenges in visual relation detection: (1) the difficulty of obtaining box-level annotations to train fully-supervised models, and (2) the variability in appearance of visual relations. We first propose a weakly-supervised approach which, given pre-trained object detectors, learns relation detectors from image-level labels only, while maintaining performance close to that of fully-supervised models. Second, we propose a model that combines embeddings of different granularities (for subject, object, predicate, and triplet) to better model appearance variation, and we introduce an analogical reasoning module to generalize to unseen triplets. Experimental results demonstrate the improvement of our hybrid model over a purely compositional model and validate the benefits of transfer by analogy for retrieving unseen triplets.
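The hybrid idea described above — blending per-unigram (subject, predicate, object) similarities with a joint triplet-level similarity — can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the thesis's actual model: the cosine scoring, the dictionary layout, and the blending weight `alpha` are all assumptions made for the example.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def triplet_score(visual, language, alpha=0.5):
    """Score a candidate (subject, predicate, object) detection.

    `visual` and `language` are dicts mapping 's', 'p', 'o' to unigram
    embeddings and 't' to a joint triplet embedding (all hypothetical
    names for this sketch). The compositional part sums the unigram
    similarities; the hybrid score blends in the triplet-level
    similarity with weight `alpha`.
    """
    compositional = sum(cosine(visual[k], language[k]) for k in ("s", "p", "o"))
    return compositional + alpha * cosine(visual["t"], language["t"])

# Toy usage: a visual candidate whose embeddings nearly match the query
# language embeddings should outscore a random candidate.
rng = np.random.default_rng(0)
d = 8
lang = {k: rng.normal(size=d) for k in ("s", "p", "o", "t")}
vis_match = {k: lang[k] + 0.01 * rng.normal(size=d) for k in ("s", "p", "o", "t")}
vis_random = {k: rng.normal(size=d) for k in ("s", "p", "o", "t")}
```

Retrieval of unseen triplets then amounts to ranking candidate boxes by this score against a language query the model was never trained on, which is where the analogy-based transfer described in the abstract would come in.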
Cited literature: 299 references
Submitted on: Monday, March 23, 2020, 5:58:26 PM
Last modified on: Thursday, October 29, 2020, 3:01:42 PM


Version validated by the jury (STAR)


  • HAL Id: tel-02332673, version 2



Julia Peyre. Learning to detect visual relations. Artificial Intelligence [cs.AI]. Université Paris sciences et lettres, 2019. English. ⟨NNT : 2019PSLEE016⟩. ⟨tel-02332673v2⟩