Skip to Main content Skip to Navigation
Conference papers

TAPIOCA: An I/O Library for Optimized Topology-Aware Data Aggregation on Large-Scale Supercomputers

Abstract : Reading and writing data efficiently from storage system is necessary for most scientific simulations to achieve good performance at scale. Many software solutions have been developed to decrease the I/O bottleneck. One well-known strategy, in the context of collective I/O operations, is the two-phase I/O scheme. This strategy consists of selecting a subset of processes to aggregate contiguous pieces of data before performing reads/writes. In this paper, we present TAPIOCA, an MPI-based library implementing an efficient topology-aware two-phase I/O algorithm. We show how TAPIOCA can take advantage of double-buffering and one-sided communication to reduce as much as possible the idle time during data aggregation. We also introduce our cost model leading to a topology-aware aggregator placement optimizing the movements of data. We validate our approach at large scale on two leadership-class supercomputers: Mira (IBM BG/Q) and Theta (Cray XC40). We present the results obtained with TAPIOCA on a micro-benchmark and the I/O kernel of a large-scale simulation. On both architectures, we show a substantial improvement of I/O performance compared with the default MPI I/O implementation. On BG/Q+GPFS, for instance, our algorithm leads to a performance improvement by a factor of twelve while on the Cray XC40 system associated with a Lustre filesystem, we achieve an improvement of four.
Complete list of metadata

Cited literature [22 references]  Display  Hide  Download
Contributor : Emmanuel Jeannot Connect in order to contact the contributor
Submitted on : Tuesday, October 24, 2017 - 10:13:59 AM
Last modification on : Monday, November 30, 2020 - 4:18:05 PM
Long-term archiving on: : Thursday, January 25, 2018 - 12:23:27 PM


Files produced by the author(s)




François Tessier, Venkatram Vishwanath, Emmanuel Jeannot. TAPIOCA: An I/O Library for Optimized Topology-Aware Data Aggregation on Large-Scale Supercomputers. CLUSTER 2017 - IEEE International Conference on Cluster Computing, Sep 2017, Honolulu, United States. pp.1-11, ⟨10.1109/CLUSTER.2017.80⟩. ⟨hal-01621344⟩



Record views


Files downloads