Bringing Elastic MapReduce to Scientific Clouds

Abstract: The MapReduce programming model, proposed by Google, offers a simple and efficient way to perform distributed computation over large data sets. The Apache Hadoop framework is a free and open-source implementation of MapReduce. To simplify the use of Hadoop, Amazon Web Services provides Elastic MapReduce, a web service that enables users to submit MapReduce jobs. Elastic MapReduce takes care of resource provisioning, Hadoop configuration and performance tuning, data staging, and fault tolerance, among other tasks. This service drastically lowers the barrier to entry for performing MapReduce computations in the cloud. However, Elastic MapReduce is limited to Amazon EC2 resources and incurs an additional fee. In this paper, we present our work towards an implementation of Elastic MapReduce that can use resources from clouds other than Amazon EC2, such as scientific clouds. This work will also serve as a foundation for more advanced experiments, such as performing MapReduce computations over multiple distributed clouds.

https://hal.inria.fr/inria-00624263
Contributor: Pierre Riteau
Submitted on: Friday, September 16, 2011 - 11:31:49 AM
Last modification on: Friday, November 16, 2018 - 1:40:30 AM
Long-term archiving on: Monday, December 5, 2016 - 2:36:37 AM

File

main.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: inria-00624263, version 1

Citation

Pierre Riteau, Kate Keahey, Christine Morin. Bringing Elastic MapReduce to Scientific Clouds. 3rd Annual Workshop on Cloud Computing and Its Applications: Poster Session, Apr 2011, Argonne, Illinois, United States. ⟨inria-00624263⟩
