Habilitation à diriger des recherches

Exploring and Learning from Visual Data

Yannis Avrithis 1
1 LinkMedia - Creating and exploiting explicit links between multimedia fragments
Inria Rennes – Bretagne Atlantique, IRISA-D6 - MEDIA ET INTERACTIONS
Abstract : This manuscript is about a journey. The journey of computer vision and machine learning research from the early years of Gabor filters and linear classifiers to surpassing human skills in several tasks today. The journey of the author's own research, designing representations and matching processes to explore visual data and exploring visual data to learn better representations. Part I addresses instance-level visual search and clustering, building on shallow visual representations and matching processes. The representation is obtained by a pipeline of local features, hand-crafted descriptors and visual vocabularies. Improvements in the pipeline are introduced, including the construction of large-scale vocabularies, spatial matching for geometry verification, representations beyond vocabularies and nearest neighbor search. Applications to exploring photo collections are discussed, including location recognition, landmark recognition and automatic discovery of photos depicting the same scene. Part II addresses instance-level visual search and object discovery, building on deep visual representations and matching processes, focusing on the manifold structure of the feature space. The representation is obtained by deep parametric models learned from visual data. Contributions are made to advancing manifold search over global or regional CNN representations. This process is viewed as graph filtering, both spatial and spectral. Spatial matching is revisited with local features detected on CNN activations. Finally, a method is introduced for object discovery from CNN activations over an unlabeled image collection. Part III addresses learning deep visual representations by exploring visual data, focusing on limited or no supervision. It progresses from instance-level to category-level tasks and studies the sensitivity of models to their input. It introduces methods for unsupervised metric learning and semi-supervised learning, based again on the manifold structure of the feature space. It contributes to few-shot learning, studying activation maps and learning multiple layers to convergence for the first time. Finally, it introduces an attack that attempts to improve the visual quality of adversarial examples in terms of imperceptibility. Part IV summarizes more of the author's past and present contributions, reflects on these contributions in the present context and consolidates the ideas presented in this manuscript. It then attempts to draw a road map of ideas that are likely to come.
Keywords : computer vision
Document type : Habilitation à diriger des recherches
Contributor : Yannis Avrithis
Submitted on : Tuesday, December 8, 2020 - 9:53:37 PM
Last modification on : Friday, January 8, 2021 - 3:39:50 AM
Long-term archiving on : Tuesday, March 9, 2021 - 8:16:22 PM




  • HAL Id : tel-03047624, version 1


Yannis Avrithis. Exploring and Learning from Visual Data. Computer Vision and Pattern Recognition [cs.CV]. Université de Rennes 1, 2020. ⟨tel-03047624⟩


