Journal articles

I/O scheduling strategy for periodic applications

Abstract : With the ever-growing need for data in HPC applications, congestion at the I/O level has become critical in supercomputers. Architectural enhancements such as burst buffers and pre-fetching have been added to machines, but they are not sufficient to prevent congestion. Recent online I/O scheduling strategies have been put in place, but they introduce an additional congestion point and add overhead to the applications' computation. In this work, we show how to take advantage of the periodic nature of HPC applications to develop efficient periodic scheduling strategies for their I/O transfers. Our strategy computes a pattern once, during the job scheduling phase, that defines the I/O behavior of each application; the applications then run independently, performing their I/O at the specified times. This limits congestion at the I/O node level and can be easily integrated into current job schedulers. We validate the model through extensive simulations and experiments on an HPC cluster, comparing it to state-of-the-art online solutions. The results show that our scheduler is decentralized, and thus avoids the overhead of online schedulers, and that it also outperforms the other solutions, improving application dilation by up to 16% and maximum system efficiency by up to 18%.
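The idea of computing a periodic I/O pattern once and letting applications follow it independently can be illustrated with a minimal sketch. This is not the algorithm from the article: it assumes each application performs exactly one compute phase and one I/O phase per period, that I/O windows get exclusive access to the full bandwidth, and that windows are placed greedily back to back. The names `App`, `build_pattern` and `dilation` are hypothetical, and dilation is taken here as the ratio of the period to the congestion-free iteration time.

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    compute: float  # seconds of computation per iteration (assumed known)
    volume: float   # GB transferred per iteration (assumed known)


def build_pattern(apps, bandwidth):
    """Compute once a period and one exclusive I/O window per application.

    Windows are laid out back to back, so no two applications ever
    transfer data at the same time within a period.
    """
    io_time = {a.name: a.volume / bandwidth for a in apps}
    # The period must fit all I/O windows and each app's full iteration.
    period = max(sum(io_time.values()),
                 max(a.compute + io_time[a.name] for a in apps))
    windows, start = {}, 0.0
    for a in apps:
        windows[a.name] = (start, start + io_time[a.name])
        start += io_time[a.name]
    return period, windows


def dilation(app, period, bandwidth):
    """Slowdown relative to running alone with the full bandwidth."""
    return period / (app.compute + app.volume / bandwidth)


apps = [App("A", compute=8.0, volume=20.0), App("B", compute=4.0, volume=10.0)]
period, windows = build_pattern(apps, bandwidth=10.0)
for a in apps:
    print(a.name, windows[a.name], dilation(a, period, 10.0))
```

Because the pattern is fixed before the applications start, no runtime coordination is needed in this sketch; the price is that a short application such as `B` is stretched to the common period, which a more refined pattern would avoid by letting it run several iterations per period.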

Cited literature [43 references]

https://hal.inria.fr/hal-02141576
Contributor : Guillaume Pallez (aupy)
Submitted on : Tuesday, May 28, 2019 - 1:06:40 PM
Last modification on : Wednesday, November 20, 2019 - 7:54:21 AM

File

topc.pdf
Files produced by the author(s)

Citation

Ana Gainaru, Valentin Le Fèvre, Guillaume Pallez. I/O scheduling strategy for periodic applications. ACM Transactions on Parallel Computing, Association for Computing Machinery, In press, ⟨10.1145/nnnnnnn.nnnnnnn⟩. ⟨hal-02141576⟩
