
Learning to detect visual relations

Abstract: In this thesis, we study the problem of detecting visual relations of the form (subject, predicate, object) in images, which are intermediate-level semantic units between objects and complex scenes. Our work addresses two main challenges in visual relation detection: (1) the difficulty of obtaining box-level annotations to train fully-supervised models, and (2) the variability of appearance of visual relations. We first propose a weakly-supervised approach which, given pre-trained object detectors, enables us to learn relation detectors from image-level labels only, while maintaining performance close to that of fully-supervised models. Second, we propose a model that combines different granularities of embeddings (for subject, object, predicate and triplet) to better model appearance variation, and we introduce an analogical reasoning module to generalize to unseen triplets. Experimental results demonstrate the improvement of our hybrid model over a purely compositional model and validate the benefits of our transfer by analogy for retrieving unseen triplets.
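The compositional idea described in the abstract — representing a triplet by combining embeddings of its subject, predicate, and object, so that unseen triplets can be scored from seen components — can be illustrated with a toy sketch. This is not the thesis's actual model; all names, dimensions, and the additive combination rule are illustrative assumptions.

```python
# Hypothetical sketch of compositional triplet scoring (illustrative only;
# embeddings are random toy vectors, not learned representations).
import numpy as np

rng = np.random.default_rng(0)
D = 16  # embedding dimension (illustrative)

# Toy "learned" embeddings for a small vocabulary of subjects/predicates/objects.
vocab = {w: rng.normal(size=D) for w in
         ["person", "horse", "ride", "dog", "walk"]}

def unit(v):
    return v / np.linalg.norm(v)

def compositional_score(visual_feat, subj, pred, obj):
    """Score a (subject, predicate, object) triplet compositionally:
    the triplet representation is built from its parts, so a triplet
    never seen as a whole can still be scored from seen components."""
    triplet_emb = unit(vocab[subj] + vocab[pred] + vocab[obj])
    return float(unit(visual_feat) @ triplet_emb)

# A toy visual feature roughly aligned with ("person", "ride", "horse").
visual = (vocab["person"] + vocab["ride"] + vocab["horse"]
          + 0.1 * rng.normal(size=D))

seen = compositional_score(visual, "person", "ride", "horse")
unseen = compositional_score(visual, "person", "walk", "dog")
print(seen > unseen)  # the matching triplet scores higher
```

In high dimensions, random component embeddings are nearly orthogonal, so the mismatched triplet shares only the "person" term with the visual feature and scores lower; a hybrid model as described in the abstract would additionally learn dedicated embeddings for frequent whole triplets.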

Cited literature: 299 references

Submitted on: Monday, March 23, 2020 - 5:58:26 PM
Last modification on: Thursday, March 17, 2022 - 10:08:54 AM


Version validated by the jury (STAR)


  • HAL Id: tel-02332673, version 2



Julia Peyre. Learning to detect visual relations. Artificial Intelligence [cs.AI]. Université Paris sciences et lettres, 2019. English. ⟨NNT : 2019PSLEE016⟩. ⟨tel-02332673v2⟩


