Distributed High-Dimensional Index Creation using Hadoop, HDFS and C++

Gylfi Þór Guðmundsson 1, Laurent Amsaleg 1, Björn Þór Jónsson 2
1 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract: This paper presents an initial study in which the creation of a high-dimensional index is parallelized and distributed using the Hadoop framework. Early experimental results show substantial performance gains, even though the Hadoop framework is only loosely coupled to the C++-based index creation code. Two main lessons can be drawn from this work: (i) it is worth investing the time, energy and manpower to re-implement the code in Java in order to benefit from all the features of Hadoop; although our results are already impressive, even better performance gains can be expected once the index creation is re-implemented in Java; and (ii) special care must be taken to account for the network topology, so that message exchanges do not become the new bottleneck once parallelism removes the CPU bottleneck and HDFS removes the I/O bottleneck.
Document type:
Conference paper
CBMI - 10th Workshop on Content-Based Multimedia Indexing, Jun 2012, Annecy, France. 2012, 〈10.1109/CBMI.2012.6269848〉

https://hal.inria.fr/hal-00764434
Contributor: Laurent Amsaleg
Submitted on: Thursday, 13 December 2012 - 09:19:59
Last modified on: Thursday, 11 January 2018 - 06:20:10
Document(s) archived on: Thursday, 14 March 2013 - 03:46:21

File

decp.pdf
Files produced by the author(s)

Citation

Gylfi Þór Guðmundsson, Laurent Amsaleg, Björn Þór Jónsson. Distributed High-Dimensional Index Creation using Hadoop, HDFS and C++. CBMI - 10th Workshop on Content-Based Multimedia Indexing, Jun 2012, Annecy, France. 2012, 〈10.1109/CBMI.2012.6269848〉. 〈hal-00764434〉
