Evaluation of Data Locality Strategies for Hybrid Cloud Bursting of Iterative MapReduce

Abstract: Hybrid cloud bursting (i.e., leasing temporary off-premise cloud resources to boost the overall capacity during peak utilization) is a popular and cost-effective way to deal with the increasing complexity of big data analytics. It is particularly promising for iterative MapReduce applications that reuse massive amounts of input data at each iteration, which compensates for the high overhead and cost of concurrent data transfers from the on-premise to the off-premise VMs over a weak inter-site link of limited capacity. In this paper we study how to combine various MapReduce data locality techniques designed for hybrid cloud bursting in order to achieve scalability for iterative MapReduce applications in a cost-effective fashion. This is a non-trivial problem due to the complex interaction between the data movements over the weak link and the scheduling of computational tasks that have to adapt to the shifting data distribution. We show that, with the right combination of techniques, iterative MapReduce applications can scale well in a hybrid cloud bursting scenario and even come close to the scalability observed in single sites.
Document type:
Conference paper
CCGrid’17: 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, May 2017, Madrid, Spain

Cited literature: 15 references

https://hal.inria.fr/hal-01469991
Contributor: Bogdan Nicolae
Submitted on: Friday, February 17, 2017 - 00:31:59
Last modified on: Friday, February 17, 2017 - 12:03:28
Document(s) archived on: Thursday, May 18, 2017 - 13:02:04

File

paper.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01469991, version 1

Citation

Francisco Clemente-Castello, Bogdan Nicolae, M. Mustafa Rafique, Rafael Mayo, Juan Carlos Fernandez. Evaluation of Data Locality Strategies for Hybrid Cloud Bursting of Iterative MapReduce. CCGrid’17: 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, May 2017, Madrid, Spain. 〈hal-01469991〉

Metrics

Record views: 73

File downloads: 186