Indexing and Searching 100M Images with Map-Reduce

Diana Moise (1), Denis Shestakov (1), Gylfi Thór Gudmundsson (1), Laurent Amsaleg (1)
(1) TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract: Most researchers working on high-dimensional indexing agree on the following three trends: (i) the size of the multimedia collections to index is now reaching millions if not billions of items, (ii) the computers we use every day now come with multiple cores, and (iii) hardware is becoming more available, thanks to easier access to Grids and/or Clouds. This paper shows how the Map-Reduce paradigm can be applied to indexing algorithms and demonstrates that great scalability can be achieved using Hadoop, a popular Map-Reduce-based framework. Dramatic performance improvements, however, are not guaranteed a priori: such frameworks are rigid, they severely constrain the possible access patterns to data, and scarce resources such as RAM have to be shared. Furthermore, algorithms require major redesign and may have to settle for sub-optimal behavior. The benefits, however, are many: simplicity for programmers, automatic distribution, fault tolerance, failure detection and automatic re-runs and, last but not least, scalability. We share our experience of adapting a clustering-based high-dimensional indexing algorithm to the Map-Reduce model, and of testing it at large scale with Hadoop as we index 30 billion SIFT descriptors. We foresee that lessons drawn from our work could minimize the time, effort and energy invested by other researchers and practitioners working in similar directions.
Document type:
Conference paper
ACM International Conference on Multimedia Retrieval, Apr 2013, Dallas, United States. 2013

https://hal.inria.fr/hal-00796475
Contributor: Laurent Amsaleg
Submitted on: Monday, March 4, 2013 - 11:39:25
Last modified on: Wednesday, May 16, 2018 - 11:23:06
Document(s) archived on: Wednesday, June 5, 2013 - 03:56:43

File

icmr115-moise.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00796475, version 1

Citation

Diana Moise, Denis Shestakov, Gylfi Thór Gudmundsson, Laurent Amsaleg. Indexing and Searching 100M Images with Map-Reduce. ACM International Conference on Multimedia Retrieval, Apr 2013, Dallas, United States. 2013. 〈hal-00796475〉
