Web-scale image clustering revisited

Ioannis Z. Emiris; Yannis Avrithis; Yannis Kalantidis; Evangelos Anagnostopoulos

Conference paper. Year: 2015


Ioannis Z. Emiris

Role: Author
PersonId: 10682
IdHAL: ioannis-emiris
ORCID: 0000-0002-2339-5303
IdRef: 092244807

National and Kapodistrian University of Athens

Inria Sophia Antipolis - Méditerranée

AlgebRe, geOmetrie, Modelisation et AlgoriTHmes

Yannis Avrithis

Role: Author
PersonId: 20705
IdHAL: yannis-avrithis
ORCID: 0000-0001-7476-4482
IdRef: 253126193

Creating and exploiting explicit links between multimedia fragments

Yannis Kalantidis

Role: Author

Yahoo Inc.

Evangelos Anagnostopoulos

Role: Author

National and Kapodistrian University of Athens

Abstract

Large-scale duplicate detection, clustering and mining of documents or images have conventionally been treated with seed detection via hashing, followed by seed-growing heuristics using fast search. Principled clustering methods, especially kernelized and spectral ones, have higher complexity and are difficult to scale beyond millions of data points. Under the assumption of documents or images embedded in Euclidean space, we revisit recent advances in approximate k-means variants, and borrow their best ingredients to introduce a new one, inverted-quantized k-means (IQ-means). Key underlying concepts are quantization of data points and multi-index based inverted search from centroids to cells. Its quantization is a form of hashing and analogous to seed detection, while its updates are analogous to seed growing, yet principled in the sense of distortion minimization. We further design a dynamic variant that is able to determine the number of clusters k in a single run at nearly zero additional cost. Combined with powerful deep learned representations, we achieve clustering of a 100 million image collection on a single machine in less than one hour.
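The two ideas named in the abstract, quantizing data points into cells and searching from centroids to cells, can be illustrated with a small sketch. This is not the authors' IQ-means: it assumes a uniform grid as the quantizer (a stand-in for the learned codebooks a real system would use) and replaces the multi-index search with a brute-force loop over all cells. The function names `quantize`, `sq_dist` and `cell_kmeans`, and the `step` parameter, are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import defaultdict


def quantize(points, step):
    """Map each point to a grid cell; return (population, mean) per non-empty cell."""
    cells = defaultdict(lambda: [0, None])
    for p in points:
        key = tuple(math.floor(x / step) for x in p)
        entry = cells[key]
        entry[0] += 1
        if entry[1] is None:
            entry[1] = list(p)
        else:
            entry[1] = [a + b for a, b in zip(entry[1], p)]
    return [(n, [s / n for s in total]) for n, total in cells.values()]


def sq_dist(a, b):
    """Squared Euclidean distance between two coordinate sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cell_kmeans(cells, k, iters=20, seed=0):
    """k-means over cells instead of points (assumption: brute-force search)."""
    rng = random.Random(seed)
    centroids = [list(mean) for _, mean in rng.sample(cells, k)]
    dim = len(centroids[0])
    for _ in range(iters):
        # Inverted direction: each centroid visits cells and claims a cell
        # when it is closer than the best centroid seen so far.
        best = [(math.inf, -1)] * len(cells)
        for j, c in enumerate(centroids):
            for i, (_, mean) in enumerate(cells):
                d = sq_dist(c, mean)
                if d < best[i][0]:
                    best[i] = (d, j)
        # Update: population-weighted mean of the cells assigned to each centroid.
        sums = [[0.0] * dim for _ in range(k)]
        weights = [0] * k
        for (n, mean), (_, j) in zip(cells, best):
            weights[j] += n
            for t, x in enumerate(mean):
                sums[j][t] += n * x
        for j in range(k):
            if weights[j] > 0:  # an empty cluster keeps its previous centroid
                centroids[j] = [s / weights[j] for s in sums[j]]
    return centroids


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic data: two 2-D Gaussian blobs, 500 points each.
    points = [(rng.gauss(c, 0.5), rng.gauss(c, 0.5))
              for c in (0.0, 10.0) for _ in range(500)]
    cells = quantize(points, step=0.5)
    print(len(points), "points reduced to", len(cells), "cells")
    print(cell_kmeans(cells, k=2))
```

In this sketch each iteration costs time proportional to the number of non-empty cells rather than the number of points, which is the reason for quantizing first. A real multi-index would additionally restrict each centroid to a short list of nearby cells instead of scanning them all.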

Keywords

Approximation theory; Data mining; Document handling; Image processing; Minimisation; Pattern clustering; Search problems

Domains

Artificial intelligence [cs.AI]; Computer vision and pattern recognition [cs.CV]

Main file

Avrithis_Web-Scale_Image_Clustering_ICCV_2015.pdf (1.18 MB)

Origin: Files produced by the author(s)


https://inria.hal.science/hal-01990662

Submitted on: Wednesday, January 23, 2019, 12:21:19

Last modified on: Friday, March 24, 2023, 14:53:09

Long-term archiving on: Wednesday, April 24, 2019, 13:39:21

Dates and versions

hal-01990662, version 1 (23-01-2019)

Identifiers

HAL Id: hal-01990662, version 1

Cite

Ioannis Z. Emiris, Yannis Avrithis, Yannis Kalantidis, Evangelos Anagnostopoulos. Web-scale image clustering revisited. ICCV 2015 - International Conference on Computer Vision, Dec 2015, Santiago, Chile. ⟨hal-01990662⟩


Collections

UNIV-RENNES1 CNRS INRIA INSA-RENNES IRISA CENTRALESUPELEC INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-COTEDAZUR UNIV-RENNES UR1-MATH-NUM

82 views

128 downloads
