Self-configuration of the Number of Concurrently Running MapReduce Jobs in a Hadoop Cluster

Abstract : There is a trade-off between the number of concurrently running MapReduce jobs and the number of map and reduce tasks they can run within a node in a Hadoop cluster. Leaving this trade-off statically configured to a single value can significantly increase job response times and leave resources underused. To overcome this problem, we propose an approach based on a feedback control loop that dynamically adjusts the Hadoop resource manager configuration according to the current state of the cluster. A preliminary assessment based on workloads synthesized from real-world traces shows that system performance can be improved by about 30% compared to the default Hadoop setup.
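
The abstract does not spell out the controller, so the following is only a minimal sketch of what one step of such a feedback loop might look like, not the authors' implementation. It assumes a hypothetical monitor that reports cluster memory usage and the number of waiting jobs, and a single tunable share of resources reserved for per-job masters (in YARN's Capacity Scheduler, `yarn.scheduler.capacity.maximum-am-resource-percent` plays a comparable role; whether the paper tunes that property is an assumption here). The names `ClusterState` and `next_job_share`, the proportional rule, and all default values are illustrative.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Hypothetical snapshot reported by a cluster monitor."""
    memory_used_fraction: float  # share of cluster memory in use, 0.0 to 1.0
    pending_jobs: int            # jobs submitted but not yet admitted


def next_job_share(current: float, state: ClusterState,
                   target: float = 0.9, gain: float = 0.5,
                   low: float = 0.05, high: float = 0.9) -> float:
    """Return the next value of the job-admission share (assumed knob).

    A larger share admits more concurrent jobs but leaves less room for
    their map and reduce tasks; a smaller share does the opposite.
    """
    # Positive error: the cluster is below the target utilization.
    error = target - state.memory_used_fraction
    if error > 0 and state.pending_jobs == 0:
        # Spare capacity but no waiting jobs: raising the share would not help.
        return current
    # Proportional step, clamped to the allowed range.
    proposed = current + gain * error
    return max(low, min(high, proposed))


if __name__ == "__main__":
    share = 0.1
    for state in (ClusterState(0.5, 3), ClusterState(0.8, 1), ClusterState(1.0, 0)):
        share = next_job_share(share, state)
        print(f"used={state.memory_used_fraction:.2f} "
              f"pending={state.pending_jobs} -> share={share:.3f}")
```

In a real deployment the returned value would have to be pushed to the resource manager and the loop rerun at a fixed interval; both steps are left out here because the abstract gives no detail on them.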
Document type :
Conference papers

https://hal.inria.fr/hal-01143157
Contributor : Bo Zhang
Submitted on : Wednesday, May 6, 2015 - 11:24:44 AM
Last modification on : Thursday, April 4, 2019 - 10:18:05 AM
Long-term archiving on : Wednesday, April 19, 2017 - 6:26:20 PM

File

icac15-paper.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01143157, version 1

Citation

Bo Zhang, Filip Křikava, Romain Rouvoy, Lionel Seinturier. Self-configuration of the Number of Concurrently Running MapReduce Jobs in a Hadoop Cluster. ICAC 2015, Jul 2015, Grenoble, France. pp.149-150. ⟨hal-01143157⟩
