Dynamic Scheduling of MapReduce Shuffle under Bandwidth Constraints

Abstract : In both e-science and business, the amount of data produced every year is growing rapidly. Managing and processing this data raises new challenges. MapReduce is one answer to the need for scalable tools that can handle such volumes. It imposes a general structure on the computation and lets the implementation perform its own optimizations. The computation includes a phase called Shuffle, in which every node sends a possibly large amount of data to every other node. This report proposes and evaluates six algorithms to improve data transfers during the Shuffle phase under bandwidth constraints.
Cited literature [37 references]

https://hal.inria.fr/hal-01254055
Contributor : Christian Perez
Submitted on : Monday, January 11, 2016 - 4:51:49 PM
Last modification on : Friday, April 20, 2018 - 3:44:26 PM
Long-term archiving on : Tuesday, April 12, 2016 - 11:36:57 AM

File

RR-8574.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01254055, version 1

Citation

Sylvain Gault, Frédéric Desprez. Dynamic Scheduling of MapReduce Shuffle under Bandwidth Constraints. [Research Report] 8574, Inria. 2014, pp.38. ⟨hal-01254055⟩
