Journal articles

Optimization of Real-World MapReduce Applications With Flame-MR: Practical Use Cases

Jorge Veiga 1, Roberto R Expósito 1, Bruno Raffin 2, Juan Tourino 1
2 DATAMOVE - Data Aware Large Scale Computing
Inria Grenoble - Rhône-Alpes, LIG - Laboratoire d'Informatique de Grenoble
Abstract: Apache Hadoop is a widely used MapReduce framework for storing and processing large amounts of data. However, it has performance issues that hinder its use in many practical cases. Although existing alternatives such as Spark or Hama can outperform Hadoop, they require rewriting the source code of the applications because of API incompatibilities. This paper studies the use of Flame-MR, an in-memory processing architecture for MapReduce applications, to improve the performance of real-world use cases transparently while keeping application compatibility. Flame-MR adapts to the characteristics of the workloads, efficiently managing custom data formats and iterative computations while also reducing workload imbalance. The experimental evaluation, conducted on high-performance clusters and the Microsoft Azure cloud, shows that Flame-MR clearly outperforms Hadoop. In most cases, Flame-MR reduces execution times by more than half.

Contributor: Bruno Raffin
Submitted on: Friday, December 14, 2018 - 1:49:47 PM
Last modification on: Friday, July 8, 2022 - 10:05:38 AM
Long-term archiving on: Friday, March 15, 2019 - 4:08:08 PM


Jorge Veiga, Roberto R Expósito, Bruno Raffin, Juan Tourino. Optimization of Real-World MapReduce Applications With Flame-MR: Practical Use Cases. IEEE Access, 2018, 6, pp.69750-69762. ⟨10.1109/ACCESS.2018.2880842⟩. ⟨hal-01955503⟩


