Journal articles

On-the-fly scheduling vs. reservation-based scheduling for unpredictable workflows

Abstract : Scientific insights in the coming decade will clearly depend on the effective processing of large datasets generated by dynamic heterogeneous applications typical of workflows in large data centers or of emerging fields like neuroscience. In this paper, we show how these big data workflows have a unique set of characteristics that pose challenges for leveraging HPC methodologies, particularly in scheduling. Our findings indicate that execution times for these workflows are highly unpredictable and are not correlated with the size of the dataset involved or the precise functions used in the analysis. We characterize this inherent variability and sketch the need for new scheduling approaches by quantifying significant gaps in achievable performance. Through simulations, we show how on-the-fly scheduling approaches can deliver benefits in both system-level and user-level performance measures. On average, we find improvements of up to 35% in system utilization and up to 45% in average stretch of the applications, illustrating the potential of increasing performance through new scheduling approaches.

Cited literature [37 references]
Contributor : Guillaume Pallez (aupy)
Submitted on : Tuesday, March 5, 2019 - 9:44:21 PM
Last modification on : Thursday, November 14, 2019 - 10:22:31 AM
Long-term archiving on : Thursday, June 6, 2019 - 6:30:14 PM


Files produced by the author(s)




Ana Gainaru, Hongyang Sun, Guillaume Aupy, Yuankai Huo, Bennett Landman, et al. On-the-fly scheduling vs. reservation-based scheduling for unpredictable workflows. International Journal of High Performance Computing Applications, SAGE Publications, In press, ⟨10.1177/1094342019841681⟩. ⟨hal-02058290⟩


