Web-scale image clustering revisited

Ioannis Z. Emiris; Yannis Avrithis; Yannis Kalantidis; Evangelos Anagnostopoulos

Conference Papers, Year : 2015


Ioannis Z. Emiris

Function : Author
PersonId : 10682
IdHAL : ioannis-emiris
ORCID : 0000-0002-2339-5303
IdRef : 092244807

National and Kapodistrian University of Athens

Inria Sophia Antipolis - Méditerranée

AlgebRe, geOmetrie, Modelisation et AlgoriTHmes

Yannis Avrithis

Function : Author
PersonId : 20705
IdHAL : yannis-avrithis
ORCID : 0000-0001-7476-4482
IdRef : 253126193

Creating and exploiting explicit links between multimedia fragments

Yannis Kalantidis

Function : Author

Yahoo Inc.

Evangelos Anagnostopoulos

Function : Author

National and Kapodistrian University of Athens

Abstract

Large scale duplicate detection, clustering and mining of documents or images has been conventionally treated with seed detection via hashing, followed by seed growing heuristics using fast search. Principled clustering methods, especially kernelized and spectral ones, have higher complexity and are difficult to scale above millions. Under the assumption of documents or images embedded in Euclidean space, we revisit recent advances in approximate k-means variants, and borrow their best ingredients to introduce a new one, inverted-quantized k-means (IQ-means). Key underlying concepts are quantization of data points and multi-index based inverted search from centroids to cells. Its quantization is a form of hashing and analogous to seed detection, while its updates are analogous to seed growing, yet principled in the sense of distortion minimization. We further design a dynamic variant that is able to determine the number of clusters k in a single run at nearly zero additional cost. Combined with powerful deep learned representations, we achieve clustering of a 100 million image collection on a single machine in less than one hour.
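The two concepts named in the abstract, quantization of data points and search from centroids to cells, can be illustrated with a small sketch. This is not the authors' IQ-means: a uniform grid stands in for the quantizer, a brute-force sort stands in for the multi-index search, and the names `quantize`, `cell_kmeans`, `step` and `window` are hypothetical. Cells that no centroid reaches within its window are simply left unassigned for that iteration.

```python
import math
import random


def quantize(points, step):
    """Bucket points into grid cells of side `step`; return (count, mean) per cell."""
    acc = {}
    for p in points:
        key = tuple(math.floor(x / step) for x in p)
        count, total = acc.get(key, (0, [0.0] * len(p)))
        acc[key] = (count + 1, [t + x for t, x in zip(total, p)])
    return [(n, [t / n for t in total]) for n, total in acc.values()]


def sq_dist(a, b):
    """Squared Euclidean distance between two coordinate sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cell_kmeans(cells, k, window, iters=10, seed=0):
    """Weighted k-means over cells; each centroid visits only its `window` nearest cells."""
    rng = random.Random(seed)
    centroids = [list(mean) for _, mean in rng.sample(cells, k)]
    dim = len(centroids[0])
    for _ in range(iters):
        best = {}  # cell index -> (squared distance, centroid index)
        for j, c in enumerate(centroids):
            # Search goes from the centroid to cells (brute force here, an assumption).
            ranked = sorted(range(len(cells)), key=lambda i: sq_dist(cells[i][1], c))
            for i in ranked[:window]:
                d = sq_dist(cells[i][1], c)
                if i not in best or d < best[i][0]:
                    best[i] = (d, j)
        # Update step: each centroid becomes the count-weighted mean of the cells it won.
        sums = [[0.0] * dim for _ in range(k)]
        weights = [0] * k
        for i, (_, j) in best.items():
            n, mean = cells[i]
            weights[j] += n
            sums[j] = [s + n * x for s, x in zip(sums[j], mean)]
        for j in range(k):
            if weights[j] > 0:
                centroids[j] = [s / weights[j] for s in sums[j]]
    return centroids


points = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.2), (9.1, 9.2), (9.3, 9.4), (9.2, 9.1)]
cells = quantize(points, step=1.0)
centroids = cell_kmeans(cells, k=2, window=1)
```

Because the update works on per-cell counts and means, its cost in this sketch depends on the number of occupied cells rather than on the number of points, which is the intuition behind clustering quantized data.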

Keywords

Approximation theory; Data mining; Document handling; Image processing; Minimisation; Pattern clustering; Search problems

Domains

Artificial Intelligence [cs.AI]; Computer Vision and Pattern Recognition [cs.CV]

Main file

Avrithis_Web-Scale_Image_Clustering_ICCV_2015.pdf (1.18 MB)

Origin : Files produced by the author(s)

Contributor : Ioannis Emiris

https://inria.hal.science/hal-01990662

Submitted on : Wednesday, January 23, 2019, 12:21:19 PM

Last modification on : Friday, March 24, 2023, 2:53:09 PM

Long-term archiving on : Wednesday, April 24, 2019, 1:39:21 PM

Dates and versions

hal-01990662, version 1 (23-01-2019)

Identifiers

HAL Id : hal-01990662, version 1

Cite

Ioannis Z. Emiris, Yannis Avrithis, Yannis Kalantidis, Evangelos Anagnostopoulos. Web-scale image clustering revisited. ICCV 2015 - International Conference on Computer Vision, Dec 2015, Santiago, Chile. ⟨hal-01990662⟩


Collections

UNIV-RENNES1 CNRS INRIA INSA-RENNES IRISA CENTRALESUPELEC INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-COTEDAZUR UNIV-RENNES UR1-MATH-NUM

