On Exploiting Data Locality for Iterative Mapreduce Applications in Hybrid Clouds

Abstract : Hybrid cloud bursting (i.e., leasing temporary off-premise cloud resources to boost capacity during peak utilization) has made a significant impact, especially for big data analytics, where the explosion of data sizes and increasingly complex computations frequently lead to insufficient local data center capacity. Cloud bursting, however, introduces a major challenge to runtime systems due to the limited throughput and high latency of data transfers between on-premise and off-premise resources (the weak link). This issue and how to address it are not well understood. We contribute a comprehensive study of what challenges arise in this context, what potential strategies can be applied to address them, and what best practices can be leveraged in real life. Specifically, we focus our study on iterative MapReduce applications, which are a class of large-scale data-intensive applications particularly popular on hybrid clouds. In this context, we study how data locality can be leveraged over the weak link both from the storage layer perspective (when and how to move data off-premise) and from the scheduling perspective (when to compute off-premise). We conclude with a brief discussion on how to set up an experimental framework suitable to study the effectiveness of our proposal in future work.

https://hal.inria.fr/hal-01476052
Contributor : Bogdan Nicolae
Submitted on : Friday, February 24, 2017 - 3:12:02 PM
Last modification on : Thursday, June 1, 2017 - 4:27:34 PM

Citation

Francisco Clemente-Castello, Bogdan Nicolae, Rafael Mayo, Juan Carlos Fernandez, M. Mustafa Rafique. On Exploiting Data Locality for Iterative Mapreduce Applications in Hybrid Clouds. BDCAT'16: 3rd IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, Dec 2016, Shanghai, China. pp.118 - 122, ⟨10.1145/3006299.3006329⟩. ⟨hal-01476052⟩
