TAPIOCA: An I/O Library for Optimized Topology-Aware Data Aggregation on Large-Scale Supercomputers

Abstract: Reading and writing data efficiently from storage systems is necessary for most scientific simulations to achieve good performance at scale. Many software solutions have been developed to reduce the I/O bottleneck. One well-known strategy, in the context of collective I/O operations, is the two-phase I/O scheme. This strategy consists of selecting a subset of processes to aggregate contiguous pieces of data before performing reads/writes. In this paper, we present TAPIOCA, an MPI-based library implementing an efficient topology-aware two-phase I/O algorithm. We show how TAPIOCA can take advantage of double-buffering and one-sided communication to minimize idle time during data aggregation. We also introduce our cost model, which leads to a topology-aware aggregator placement that optimizes data movement. We validate our approach at large scale on two leadership-class supercomputers: Mira (IBM BG/Q) and Theta (Cray XC40). We present the results obtained with TAPIOCA on a micro-benchmark and the I/O kernel of a large-scale simulation. On both architectures, we show a substantial improvement in I/O performance compared with the default MPI I/O implementation. On BG/Q+GPFS, for instance, our algorithm improves performance by a factor of twelve, while on the Cray XC40 system with a Lustre filesystem, we achieve an improvement by a factor of four.
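
The two-phase I/O scheme mentioned in the abstract can be illustrated with a minimal single-process sketch. This is not TAPIOCA's API and uses no MPI: the function name `two_phase_write`, the even split of the file into one region per aggregator, and the toy data layout are all assumptions made for illustration. It only shows the general idea that many small, interleaved pieces are first gathered into contiguous aggregator buffers and then written with a few large operations.

```python
def two_phase_write(rank_chunks, num_aggregators, file_size):
    """Simulate a two-phase collective write (hypothetical helper).

    rank_chunks maps a rank id to a list of (file_offset, data) pieces.
    Returns the resulting file contents and the number of write operations.
    Assumes file_size > 0 and that every piece lies inside the file.
    """
    # Assumption: the file is split evenly, one contiguous region per aggregator.
    region = -(-file_size // num_aggregators)  # ceiling division
    buffers = [bytearray(max(0, min(region, file_size - a * region)))
               for a in range(num_aggregators)]

    # Phase 1 (aggregation): route each piece to the aggregator that owns
    # its file region, splitting pieces that straddle a region boundary.
    for chunks in rank_chunks.values():
        for offset, data in chunks:
            pos = 0
            while pos < len(data):
                agg = (offset + pos) // region
                local = (offset + pos) % region
                n = min(len(data) - pos, region - local)
                buffers[agg][local:local + n] = data[pos:pos + n]
                pos += n

    # Phase 2 (I/O): each aggregator issues one large contiguous write.
    out = bytearray(file_size)
    writes = 0
    for agg, buf in enumerate(buffers):
        if buf:
            out[agg * region:agg * region + len(buf)] = buf
            writes += 1
    return bytes(out), writes


# Toy layout: 4 ranks, each holding four 2-byte pieces interleaved in a 32-byte file.
rank_chunks = {r: [(r * 2 + k * 8, bytes([r]) * 2) for k in range(4)]
               for r in range(4)}
contents, writes = two_phase_write(rank_chunks, num_aggregators=2, file_size=32)
```

In this toy layout, writing each piece directly would take 16 small writes, whereas the sketch issues one write per aggregator. Topology-aware aggregator placement, double-buffering, and one-sided communication, which are the paper's contributions, are not modeled here.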
Document type:
Conference paper
CLUSTER 2017 - IEEE International Conference on Cluster Computing, Sep 2017, Honolulu, United States. IEEE, pp.1-11, 2017, 〈10.1109/CLUSTER.2017.80〉

https://hal.inria.fr/hal-01621344
Contributor: Emmanuel Jeannot
Submitted on: Tuesday, October 24, 2017 - 10:13:59
Last modified on: Thursday, January 11, 2018 - 06:27:21

File

paper_version_publiée.pdf
Files produced by the author(s)

Citation

François Tessier, Venkatram Vishwanath, Emmanuel Jeannot. TAPIOCA: An I/O Library for Optimized Topology-Aware Data Aggregation on Large-Scale Supercomputers. CLUSTER 2017 - IEEE International Conference on Cluster Computing, Sep 2017, Honolulu, United States. IEEE, pp.1-11, 2017, 〈10.1109/CLUSTER.2017.80〉. 〈hal-01621344〉

Metrics

Record views: 73

File downloads: 11