LEAR and XRCE's participation to Visual Concept Detection Task - ImageCLEF 2010

Thomas Mensink; Gabriela Csurka; Florent Perronnin; Jorge Sánchez; Jakob Verbeek

Conference paper, Year: 2010

Thomas Mensink

Role: Author

Xerox Research Centre Europe [Meylan]

Learning and recognition in vision

Gabriela Csurka

Role: Author

Xerox Research Centre Europe [Meylan]

Florent Perronnin

Role: Author

Xerox Research Centre Europe [Meylan]

Jorge Sánchez

Role: Author

Xerox Research Centre Europe [Meylan]

Jakob Verbeek

Role: Author
PersonId: 10676
IdHAL: verbeek
ORCID: 0000-0003-1419-1816
IdRef: 180998463

Learning and recognition in vision

Abstract

In this paper we present the joint effort of LEAR and XRCE for the ImageCLEF Visual Concept Detection and Annotation Task. Our first goal was to combine our individual state-of-the-art approaches: the Fisher vector image representation and the TagProp method for image auto-annotation. Our second goal was to investigate how annotation performance changes when extra information, in the form of the provided Flickr tags, is used. The results show that combining the Flickr tags with visual features improves on every method that uses visual features alone. Our winning system, an early-fusion linear SVM classifier trained on visual and Flickr-tag features, obtains 45.5% mean Average Precision (mAP), almost a 5% absolute improvement over the best visual-only system. Our best visual-only system, a late-fusion linear SVM classifier trained on two types of visual features (SIFT and colour), obtains 39.0% mAP and is close to the best visual-only system overall. The performance of TagProp is close to that of our SVM classifiers. All methods presented in this paper scale to large datasets and/or many concepts, thanks to the fast Fisher kernel (FK) framework for image representation and to the choice of classifiers. The linear SVM classifier has proven to scale well to large datasets. The k-NN approach of TagProp is interesting in this respect, since it requires only 2 parameters per concept.
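The difference between early fusion, late fusion, and the mAP measure mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the weights are hand-picked stand-ins for trained linear SVMs, the toy data are made up, and the non-interpolated form of Average Precision is an assumption about the evaluation protocol.

```python
from typing import List, Sequence


def linear_score(weights: Sequence[float], bias: float, features: Sequence[float]) -> float:
    """Decision value of a linear classifier: w . x + b."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def early_fusion(visual: Sequence[float], tags: Sequence[float]) -> List[float]:
    """Early fusion: concatenate both modalities into one vector before classifying."""
    return list(visual) + list(tags)


def late_fusion(scores: Sequence[float], mix: Sequence[float]) -> float:
    """Late fusion: weighted combination of per-modality classifier scores."""
    return sum(m * s for m, s in zip(mix, scores))


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Non-interpolated AP: mean of precision@k over the ranks k of the positives."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, precisions = 0, []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(score_table, label_table) -> float:
    """mAP: AP averaged over concepts (one row of scores and labels per concept)."""
    aps = [average_precision(s, l) for s, l in zip(score_table, label_table)]
    return sum(aps) / len(aps)


# Toy data for one concept: 4 images, 2-D visual features, 1-D tag feature.
visual = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.1, 0.9]]
tags = [[1.0], [0.0], [0.0], [1.0]]
labels = [1, 0, 1, 0]

# Early fusion: a single (hypothetical) weight vector over the concatenated features.
w_early, b_early = [1.0, -1.0, 0.5], 0.0
early_scores = [
    linear_score(w_early, b_early, early_fusion(v, t)) for v, t in zip(visual, tags)
]

# Late fusion: one (hypothetical) classifier per modality, scores averaged afterwards.
w_vis, w_tag = [1.0, -1.0], [0.5]
late_scores = [
    late_fusion([linear_score(w_vis, 0.0, v), linear_score(w_tag, 0.0, t)], [0.5, 0.5])
    for v, t in zip(visual, tags)
]

print("AP, early fusion:", average_precision(early_scores, labels))
print("AP, late fusion:", average_precision(late_scores, labels))
```

In a real system the weights would be learned per concept (for example with a linear SVM solver) and the mix coefficients tuned on held-out data; the sketch only shows where the two fusion strategies combine information.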

Keywords

Image Classification; Auto Annotation; Multi-Modal; Linear SVM; Fisher Vectors; TagProp

Domains

Computer Vision and Pattern Recognition [cs.CV]

Main file

LEAR.XRCE.ImageClef.2010.pdf (162.41 KB)

MCPSV.png (301.24 KB)

Origin: Files produced by the author(s)

Format: Figure, Image

https://inria.hal.science/inria-00548633

Submitted on: Monday, December 20, 2010, 10:22:23

Last modified on: Thursday, April 4, 2024, 20:51:20

Long-term archiving on: Monday, November 5, 2012, 14:36:44

Dates and versions

inria-00548633, version 1 (20-12-2010)

Identifiers

HAL Id: inria-00548633, version 1

Cite

Thomas Mensink, Gabriela Csurka, Florent Perronnin, Jorge Sánchez, Jakob Verbeek. LEAR and XRCE's participation to Visual Concept Detection Task - ImageCLEF 2010. ImageCLEF - Workshop Cross Language Image Retrieval, Sep 2010, Padua, Italy. pp.48. ⟨inria-00548633⟩

Collections

UNIV-RENNES1 UGA CNRS INRIA IRISA LJK LJK_GI LJK_GI_LEAR INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UR1-MATH-NUM

562 views

232 downloads
