Conference papers

On the Usability of Shortest Remaining Time First Policy in Shared Hadoop Clusters

Nathanaël Cheriere 1 Pierre Donat-Bouillud 1 Shadi Ibrahim 2 Matthieu Simonin 3 
2 KerData - Scalable Storage for Clouds and Beyond
Inria Rennes – Bretagne Atlantique , IRISA-D1 - SYSTÈMES LARGE ÉCHELLE
3 MYRIADS - Design and Implementation of Autonomous Distributed Systems
Inria Rennes – Bretagne Atlantique , IRISA-D1 - SYSTÈMES LARGE ÉCHELLE
Abstract : Hadoop has recently been used to process a diverse variety of applications, sharing the same execution infrastructure. A practical problem facing the Hadoop community is how to reduce job makespans by reducing job waiting times and execution times. Previous Hadoop schedulers have focused on improving job execution times by improving data locality, but without considering job waiting times. Even worse, enforcing data locality according to the job input sizes can be inefficient: it can lead to long waiting times for small yet short jobs when sharing the cluster with jobs with smaller input sizes but higher execution complexity. This paper presents hSRTF, an adaptation of the well-known Shortest Remaining Time First scheduler (i.e., SRTF) in shared Hadoop clusters. hSRTF embraces a simple model to estimate the remaining time of a job and a preemption primitive (i.e., kill) to free the resources when needed. We have implemented hSRTF and performed extensive evaluations with Hadoop on the Grid’5000 testbed. The results show that hSRTF can significantly reduce the waiting times of small jobs and therefore improve their makespans, but at the cost of a relatively small increase in the makespans of large jobs. For instance, a time-based proportional share mode of hSRTF (i.e., hSRTF-Pr) speeds up small jobs by (on average) 45% and 26% while introducing a performance degradation for large jobs by (on average) 10% and 0.2% compared to Fifo and Fair schedulers, respectively.
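As an illustration of the two ingredients named in the abstract (a remaining-time estimate and kill-based preemption), here is a minimal Python sketch. It is not the paper's implementation: the abstract does not specify the estimation model, so the sketch assumes remaining time is the number of task waves left multiplied by an average task duration. The names Job, estimated_remaining_time and schedule are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    remaining_tasks: int      # tasks not yet finished
    avg_task_seconds: float   # assumed: measured average task duration
    running_tasks: int = 0    # tasks currently holding a slot


def estimated_remaining_time(job: Job, total_slots: int) -> float:
    """Assumed model: waves of tasks still to run times average task duration."""
    waves = -(-job.remaining_tasks // total_slots)  # ceiling division
    return waves * job.avg_task_seconds


def schedule(jobs: list[Job], total_slots: int):
    """Grant slots in shortest-remaining-time-first order.

    Returns (allocation, kills): slots granted per job, and the number of
    running tasks each job must have killed to free slots for shorter jobs.
    """
    ordered = sorted(jobs, key=lambda j: estimated_remaining_time(j, total_slots))
    free = total_slots
    allocation = {}
    for job in ordered:
        granted = min(job.remaining_tasks, free)
        allocation[job.name] = granted
        free -= granted
    # A job running more tasks than it was granted loses the excess (kill).
    kills = {
        j.name: j.running_tasks - allocation[j.name]
        for j in jobs
        if j.running_tasks > allocation[j.name]
    }
    return allocation, kills


if __name__ == "__main__":
    jobs = [Job("big", 100, 10.0, running_tasks=4), Job("small", 2, 5.0)]
    print(schedule(jobs, total_slots=4))
```

In this toy run the large job occupies all four slots when the small job arrives. The small job has the shorter estimate, so two of the large job's tasks are killed to make room for it, which mirrors the trade-off the abstract reports: shorter waits for small jobs at some cost to large ones.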

Cited literature [13 references]
Contributor : Shadi Ibrahim
Submitted on : Wednesday, February 3, 2016 - 12:04:10 PM
Last modification on : Saturday, June 25, 2022 - 7:46:08 PM
Long-term archiving on : Saturday, November 12, 2016 - 5:48:33 AM





Nathanaël Cheriere, Pierre Donat-Bouillud, Shadi Ibrahim, Matthieu Simonin. On the Usability of Shortest Remaining Time First Policy in Shared Hadoop Clusters. SAC 2016-The 31st ACM/SIGAPP Symposium on Applied Computing, Apr 2016, Pisa, Italy. ⟨10.1145/2851613.2851626⟩. ⟨hal-01239341⟩


