Conference papers

Sim-Min-Hash: An efficient matching technique for linking large image collections

Wan-Lei Zhao 1,*, Hervé Jégou 1, Guillaume Gravier 1
* Corresponding author
1 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : One of the most successful methods for linking all similar images within a large collection is min-Hash, which significantly speeds up the comparison of images when the underlying image representation is bag-of-words. However, the quantization step of min-Hash introduces a significant loss of information. In this paper, we propose a generalization of min-Hash, called Sim-min-Hash, to compare sets of real-valued vectors. We demonstrate the effectiveness of our approach when combined with the Hamming embedding similarity. Experiments on popular large-scale benchmarks demonstrate that Sim-min-Hash is more accurate and faster than min-Hash for similar image search. Linking a collection of one million images described by 2 billion local descriptors takes 7 minutes on a single-core machine.
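For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of standard min-Hash applied to bag-of-words image representations, where each image is treated as a set of integer visual-word IDs. It is not the Sim-min-Hash method proposed in the paper, which is not specified on this page. All names here (`make_hash_params`, `minhash_signature`, `estimate_jaccard`, the choice of linear hash functions and of `PRIME`) are illustrative assumptions.

```python
import random

# Assumption: visual-word IDs are non-negative integers smaller than PRIME.
PRIME = 2_147_483_647  # the Mersenne prime 2**31 - 1


def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) pairs defining hash functions h(w) = (a * w + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def minhash_signature(word_ids, params):
    """Return one min-hash value per hash function for a non-empty set of word IDs."""
    if not word_ids:
        raise ValueError("word_ids must be non-empty")
    return [min((a * w + b) % PRIME for w in word_ids) for a, b in params]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of positions where two signatures agree (an estimate of set overlap)."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


def exact_jaccard(set_a, set_b):
    """Exact Jaccard similarity |A ∩ B| / |A ∪ B|, for comparison."""
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    params = make_hash_params(num_hashes=512)
    image_a = {1, 2, 3, 4}   # hypothetical visual words of image A
    image_b = {3, 4, 5, 6}   # hypothetical visual words of image B
    sig_a = minhash_signature(image_a, params)
    sig_b = minhash_signature(image_b, params)
    print("estimated:", estimate_jaccard(sig_a, sig_b))
    print("exact:    ", exact_jaccard(image_a, image_b))
```

In this sketch two images are compared through their short signatures rather than their full word sets, which is the source of the speed-up. Because the sets contain only quantized word IDs, any distinction between descriptors assigned to the same visual word is lost, which is the information loss the abstract attributes to the quantization step.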
Document type :
Conference papers

Contributor : Hervé Jégou
Submitted on : Monday, August 5, 2013 - 11:28:07 AM
Last modification on : Friday, July 10, 2020 - 4:08:04 PM
Long-term archiving on : Wednesday, April 5, 2017 - 7:20:37 PM




  • HAL Id : hal-00839921, version 3


Wan-Lei Zhao, Hervé Jégou, Guillaume Gravier. Sim-Min-Hash: An efficient matching technique for linking large image collections. ACM Multimedia, Oct 2013, Barcelona, Spain. ⟨hal-00839921v3⟩