CALCioM: Mitigating I/O Interference in HPC Systems through Cross-Application Coordination

Matthieu Dorier 1, 2, Gabriel Antoniu 1, Robert Ross 3, Dries Kimpe 3, Shadi Ibrahim 1
1 KerData - Scalable Storage for Clouds and Beyond, Inria Rennes – Bretagne Atlantique, IRISA-D1 - Large-Scale Systems
3 MCS, ANL - Argonne National Laboratory [Lemont]
Abstract: The mismatch between computation and storage performance in new HPC systems has led to a plethora of I/O optimizations, ranging from application-side collective I/O to network- and disk-level request scheduling on the file system side. As machines grow ever larger, the interference produced by multiple applications concurrently accessing a shared parallel file system becomes a major problem. This interference often breaks single-application I/O optimizations, dramatically degrading application I/O performance and, as a result, lowering machine-wide efficiency. This paper focuses on CALCioM, a framework that aims to mitigate I/O interference through the dynamic selection of appropriate scheduling policies. CALCioM allows several applications running on a supercomputer to communicate and coordinate their I/O strategies in order to avoid interfering with one another. In this work, we examine four I/O strategies that can be accommodated in this framework: serializing, interrupting, interfering, and coordinating. Experiments on Argonne's BG/P Surveyor machine and on several clusters of the French Grid'5000 show how CALCioM can be used to efficiently and transparently improve the scheduling strategy between two otherwise interfering applications, given specified metrics of machine-wide efficiency.
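To make the trade-off between the strategies named in the abstract concrete, here is a minimal toy model, not the paper's implementation. It assumes two applications that each write a fixed volume to a file system with a fixed aggregate bandwidth, and that concurrent writers split that bandwidth evenly with no extra penalty (real interference is usually worse). The `App` class, the three strategy functions, and `pick_strategy` are hypothetical names introduced for illustration. The "coordinating" strategy is not modeled; `pick_strategy` only stands in for the idea of selecting a policy from a given efficiency metric.

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    start: float   # time at which the I/O phase begins (s)
    volume: float  # amount of data to write (GB)


def _ordered(a, b):
    # Return the two applications sorted by the start of their I/O phase.
    return sorted((a, b), key=lambda app: app.start)


def serializing(a, b, bw):
    """First come, first served: the later application waits for the earlier one."""
    first, second = _ordered(a, b)
    end_first = first.start + first.volume / bw
    end_second = max(second.start, end_first) + second.volume / bw
    return {first.name: end_first - first.start,
            second.name: end_second - second.start}


def interrupting(a, b, bw):
    """The later application preempts the earlier one, which resumes afterwards."""
    first, second = _ordered(a, b)
    done = min(first.volume, (second.start - first.start) * bw)
    left = first.volume - done
    end_second = second.start + second.volume / bw
    if left == 0:  # the first application finished before the second arrived
        end_first = first.start + first.volume / bw
    else:
        end_first = end_second + left / bw
    return {first.name: end_first - first.start,
            second.name: end_second - second.start}


def interfering(a, b, bw):
    """Both write concurrently; assumed even bandwidth split while they overlap."""
    first, second = _ordered(a, b)
    done = min(first.volume, (second.start - first.start) * bw)
    left = first.volume - done
    if left == 0:  # no overlap, so no interference
        return {first.name: first.volume / bw, second.name: second.volume / bw}
    shared = min(left, second.volume)      # volume each writes at half rate
    t = second.start + shared / (bw / 2)   # time at which one of them finishes
    end_first = t + (left - shared) / bw
    end_second = t + (second.volume - shared) / bw
    return {first.name: end_first - first.start,
            second.name: end_second - second.start}


STRATEGIES = {"serializing": serializing,
              "interrupting": interrupting,
              "interfering": interfering}


def pick_strategy(a, b, bw, metric=lambda times: sum(times.values())):
    """Pick the strategy minimizing an efficiency metric (assumed: sum of I/O times)."""
    results = {name: fn(a, b, bw) for name, fn in STRATEGIES.items()}
    best = min(results, key=lambda name: metric(results[name]))
    return best, results[best]


big = App("A", start=0.0, volume=8.0)
small = App("B", start=2.0, volume=2.0)
best, times = pick_strategy(big, small, bw=1.0)
```

In this example a small application arrives while a large one is writing. Under the assumed metric, letting the small one interrupt gives the lowest total I/O time. A different metric, such as the maximum I/O time, can favor a different strategy, which is why the metric is a parameter.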
Document type: Conference paper
IPDPS - International Parallel and Distributed Processing Symposium, May 2014, Phoenix, United States. 2014


https://hal.inria.fr/hal-00916091
Contributor: Matthieu Dorier
Submitted on: Wednesday, April 9, 2014 - 10:32:06
Last modified on: Wednesday, August 2, 2017 - 10:06:34
Document(s) archived on: Wednesday, July 9, 2014 - 10:50:54

File

CALCioM.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00916091, version 1

Citation

Matthieu Dorier, Gabriel Antoniu, Robert Ross, Dries Kimpe, Shadi Ibrahim. CALCioM: Mitigating I/O Interference in HPC Systems through Cross-Application Coordination. IPDPS - International Parallel and Distributed Processing Symposium, May 2014, Phoenix, United States. 2014. <hal-00916091>
