On-the-fly Task Execution for Speeding Up Pipelined MapReduce

Diana Moise¹, Gabriel Antoniu¹, Luc Bougé¹
¹ KerData - Scalable Storage for Clouds and Beyond
Inria Rennes – Bretagne Atlantique, IRISA-D1 - SYSTÈMES LARGE ÉCHELLE
Abstract: The MapReduce programming model is widely acclaimed as a key solution for designing data-intensive applications. However, many computations that fit this model cannot be expressed as a single MapReduce execution and require a more complex design. Such applications, consisting of multiple jobs chained into a long-running execution, are called pipeline MapReduce applications. Standard MapReduce frameworks are not optimized for the specific requirements of pipeline applications, which leads to performance issues. To optimize the execution of pipelined MapReduce applications, we propose a mechanism for creating map tasks along the pipeline as soon as their input data becomes available. We implemented our approach in the Hadoop MapReduce framework. The benefits of our dynamic task scheduling are twofold: reducing job-completion time and increasing cluster utilization by involving more resources in the computation. Experimental evaluation performed on the Grid'5000 testbed shows that our approach delivers performance gains between 9% and 32%.
Document type: Conference paper
Euro-Par - 18th International European Conference on Parallel and Distributed Computing - 2012, Aug 2012, Rhodes Island, Greece. 2012
Cited literature: 9 references

https://hal.inria.fr/hal-00706844
Contributor: Diana Moise
Submitted on: Monday, June 11, 2012 - 16:06:36
Last modified on: Wednesday, May 16, 2018 - 11:23:28
Document(s) archived on: Wednesday, September 12, 2012 - 02:32:42

File

main.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00706844, version 1

Citation

Diana Moise, Gabriel Antoniu, Luc Bougé. On-the-fly Task Execution for Speeding Up Pipelined MapReduce. Euro-Par - 18th International European Conference on Parallel and Distributed Computing - 2012, Aug 2012, Rhodes Island, Greece. 2012. 〈hal-00706844〉

Metrics

Record views: 536

File downloads: 187