Periodic I/O scheduling for super-computers

Guillaume Aupy 1 Ana Gainaru 2 Valentin Le Fèvre 3
1 TADAAM - Topology-Aware System-Scale Data Management for High-Performance Computing
LaBRI - Laboratoire Bordelais de Recherche en Informatique, Inria Bordeaux - Sud-Ouest
3 ROMA - Optimisation des ressources : modèles, algorithmes et ordonnancement
Inria Grenoble - Rhône-Alpes, LIP - Laboratoire de l'Informatique du Parallélisme
Abstract: With the ever-growing need for data in HPC applications, congestion at the I/O level is becoming critical in supercomputers. Architectural enhancements such as burst buffers and prefetching have been added to machines, but they are not sufficient to prevent congestion. Recent online I/O scheduling strategies have been put in place, but they introduce an additional congestion point and add overhead to the computation of applications. In this work, we show how to take advantage of the periodic nature of HPC applications to develop efficient periodic scheduling strategies for their I/O transfers. Our strategy computes a pattern once, during the job scheduling phase, that defines the I/O behavior of each application; the applications then run independently, transferring their I/O at the specified times. Our strategy limits the amount of congestion at the I/O node level and can be easily integrated into current job schedulers. We validate this model through extensive simulations and experiments, comparing it to state-of-the-art online solutions. Specifically, we show not only that our scheduler has the advantage of being decentralized, thus avoiding the overhead of online schedulers, but also that on Mira one can expect an average dilation improvement of 22% together with an average throughput improvement of 32%. Finally, we show that these improvements can be expected to grow in the next generation of platforms, where the compute-I/O bandwidth imbalance increases.
Document type:
Conference paper
PMBS 2017 - 8th International Workshop High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation, Nov 2017, Denver (CO), United States. pp.1-22

https://hal.inria.fr/hal-01654645
Contributor: Guillaume Aupy
Submitted on: Monday, December 4, 2017 - 11:24:39
Last modified on: Friday, April 20, 2018 - 15:44:27

File

pmbs.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01654645, version 1

Citation

Guillaume Aupy, Ana Gainaru, Valentin Le Fèvre. Periodic I/O scheduling for super-computers. PMBS 2017 - 8th International Workshop High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation, Nov 2017, Denver (CO), United States. pp.1-22. 〈hal-01654645〉

Metrics

Record views: 387

File downloads: 34