On Exploiting Data Locality for Iterative Mapreduce Applications in Hybrid Clouds

Abstract: Hybrid cloud bursting (i.e., leasing temporary off-premise cloud resources to boost capacity during peak utilization) has made a significant impact, especially for big data analytics, where the explosion of data sizes and increasingly complex computations frequently lead to insufficient local data center capacity. Cloud bursting, however, introduces a major challenge for runtime systems due to the limited throughput and high latency of data transfers between on-premise and off-premise resources (the weak link). This issue, and how to address it, is not well understood. We contribute a comprehensive study of what challenges arise in this context, what potential strategies can be applied to address them, and what best practices can be leveraged in real life. Specifically, we focus our study on iterative MapReduce applications, a class of large-scale data-intensive applications particularly popular on hybrid clouds. In this context, we study how data locality can be leveraged over the weak link both from the storage layer perspective (when and how to move data off-premise) and from the scheduling perspective (when to compute off-premise). We conclude with a brief discussion of how to set up an experimental framework suitable for studying the effectiveness of our proposal in future work.
Document type:
Conference paper
BDCAT'16: 3rd IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, Dec 2016, Shanghai, China. pp.118 - 122, 2016, 〈10.1145/3006299.3006329〉

https://hal.inria.fr/hal-01476052
Contributor: Bogdan Nicolae
Submitted on: Friday, February 24, 2017 - 15:12:02
Last modified on: Thursday, June 1, 2017 - 16:27:34

File

paper.pdf
Files produced by the author(s)

Citation

Francisco Clemente-Castello, Bogdan Nicolae, Rafael Mayo, Juan Carlos Fernandez, M. Mustafa Rafique. On Exploiting Data Locality for Iterative Mapreduce Applications in Hybrid Clouds. BDCAT'16: 3rd IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, Dec 2016, Shanghai, China. pp.118 - 122, 2016, 〈10.1145/3006299.3006329〉. 〈hal-01476052〉

Metrics

Record views

26

File downloads

100