Image Annotation with TagProp on the MIRFLICKR set

Jakob Verbeek 1, Matthieu Guillaumin 1, Thomas Mensink 2, Cordelia Schmid 1
1 LEAR - Learning and recognition in vision
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
Abstract: Image annotation is an important computer vision problem where the goal is to determine the relevance of annotation terms for images. Image annotation has two main applications: (i) proposing a list of relevant terms to users who want to assign indexing terms to images, and (ii) supporting keyword-based search for images without indexing terms, using the relevance estimates to rank images. In this paper we present TagProp, a weighted nearest neighbour model that predicts the term relevance of images by taking a weighted sum of the annotations of the visually most similar images in an annotated training set. TagProp can use a collection of distance measures capturing different aspects of image content, such as local shape descriptors and global colour histograms. It automatically finds the optimal combination of distances to define the visual neighbours of images that are most useful for annotation prediction. TagProp compensates for the varying frequencies of annotation terms by using a term-specific sigmoid to scale the weighted nearest neighbour tag predictions. We evaluate different variants of TagProp with experiments on the MIR Flickr set, and compare them with an approach that learns a separate SVM classifier for each annotation term. We also consider using Flickr tags to train our models, both as additional features and as training labels. We find that the SVMs work better when learning from the manual annotations, whereas TagProp works better when learning from the Flickr tags. We also find that using the Flickr tags as a feature can significantly improve the performance of SVMs learned from manual annotations.
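The prediction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `tagprop_scores`, the exponential form of the neighbour weights, the fixed neighbourhood size `k`, and all parameter values are assumptions chosen for illustration, and the learning of the distance combination and sigmoid parameters is omitted entirely.

```python
import math

def tagprop_scores(distances, train_tags, dist_weights, alpha, beta, k=2):
    """Score each annotation term for one query image (illustrative only).

    distances:    D lists of N base distances (query to each training image)
    train_tags:   N lists of T binary annotations
    dist_weights: D non-negative weights combining the base distances
    alpha, beta:  T term-specific sigmoid parameters (assumed already learned)
    """
    n_train = len(train_tags)
    n_terms = len(train_tags[0])
    # Combine the base distances into a single distance per training image.
    combined = [sum(w * d[j] for w, d in zip(dist_weights, distances))
                for j in range(n_train)]
    # Keep the k visually most similar training images.
    neighbours = sorted(range(n_train), key=lambda j: combined[j])[:k]
    # Assumption: weights decay exponentially with distance and sum to one.
    raw = [math.exp(-combined[j]) for j in neighbours]
    total = sum(raw)
    weights = [r / total for r in raw]
    scores = []
    for t in range(n_terms):
        # Weighted sum of the neighbours' annotations for term t.
        vote = sum(w * train_tags[j][t] for w, j in zip(weights, neighbours))
        # Term-specific sigmoid rescales the weighted vote.
        scores.append(1.0 / (1.0 + math.exp(-(alpha[t] * vote + beta[t]))))
    return scores

# Toy data: two base distances, three training images, two terms.
distances = [[0.1, 0.5, 2.0], [0.2, 0.4, 3.0]]
train_tags = [[1, 0], [1, 1], [0, 1]]
scores = tagprop_scores(distances, train_tags,
                        dist_weights=[1.0, 1.0],
                        alpha=[1.0, 1.0], beta=[0.0, 0.0], k=2)
```

In this toy example both selected neighbours carry the first term but only the more distant one carries the second, so the first term receives the higher relevance score.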
Keywords: Image annotation
Document type:
Conference paper
MIR 2010 - 11th ACM International Conference on Multimedia Information Retrieval, Mar 2010, Philadelphia, United States. ACM Press, pp.537-546, 2010, <10.1145/1743384.1743476>

https://hal.inria.fr/inria-00548628
Contributor: Thoth Team
Submitted on: Wednesday, 6 July 2011 - 12:47:36
Last modified on: Wednesday, 9 July 2014 - 21:08:34
Document(s) archived on: Sunday, 4 December 2016 - 18:44:19

Files

verbeek10mir.pdf
Files produced by the author(s)

Citation

Jakob Verbeek, Matthieu Guillaumin, Thomas Mensink, Cordelia Schmid. Image Annotation with TagProp on the MIRFLICKR set. MIR 2010 - 11th ACM International Conference on Multimedia Information Retrieval, Mar 2010, Philadelphia, United States. ACM Press, pp.537-546, 2010, <10.1145/1743384.1743476>. <inria-00548628v2>
