Topology-Aware Data Aggregation for Intensive I/O on Large-Scale Supercomputers

Abstract : Reading and writing data efficiently from storage systems is critical for high-performance, data-centric applications. These I/O systems are increasingly characterized by complex topologies and deeper memory hierarchies. Effective parallel I/O solutions are needed to scale applications on current and future supercomputers. Data aggregation is an efficient approach in which some processes are elected to aggregate data from a set of neighbors and write the aggregated data to storage. This optimizes bandwidth use while reducing contention. In this work, we take the network topology into account when mapping aggregators, and we propose an optimized buffering system to reduce the aggregation cost. We validate our approach using micro-benchmarks and the I/O kernel of a large-scale cosmology simulation. We show I/O operations up to 15× faster than with a standard implementation of MPI I/O.
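
The aggregation scheme described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the 2D mesh, the `hops` distance, and the cost function (bytes times hops to gather the data, plus total bytes times hops to reach an I/O node) are assumptions chosen only for illustration, and every name below is hypothetical.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]


def hops(a: Coord, b: Coord) -> int:
    """Manhattan distance on a 2D mesh (assumed topology, no wraparound)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def elect_aggregator(group: List[int], coords: Dict[int, Coord],
                     sizes: Dict[int, int], io_node: Coord) -> int:
    """Return the rank in `group` with the lowest hypothetical cost.

    Assumed cost model: each member sends its bytes to the candidate
    (bytes x hops), then the candidate sends everything to the I/O node.
    """
    total_bytes = sum(sizes[r] for r in group)

    def cost(candidate: int) -> int:
        gather = sum(sizes[r] * hops(coords[r], coords[candidate])
                     for r in group)
        flush = total_bytes * hops(coords[candidate], io_node)
        return gather + flush

    # Ties are broken by the lowest rank so the choice is deterministic.
    return min(group, key=lambda r: (cost(r), r))


def aggregate(group: List[int], buffers: Dict[int, bytes]) -> bytes:
    """Concatenate the members' buffers in rank order (one contiguous write)."""
    return b"".join(buffers[r] for r in sorted(group))


if __name__ == "__main__":
    group = [0, 1, 2, 3]
    coords = {0: (0, 0), 1: (1, 0), 2: (2, 0), 3: (3, 0)}
    buffers = {0: b"aaaa", 1: b"bbbb", 2: b"cccc", 3: b"dddd"}
    sizes = {r: len(b) for r, b in buffers.items()}

    agg = elect_aggregator(group, coords, sizes, io_node=(3, 1))
    print(agg)                        # 3: the rank closest to the I/O node
    print(aggregate(group, buffers))  # b'aaaabbbbccccdddd'
```

In this toy model, the elected rank shifts toward the I/O node as the aggregated volume grows, which is the intuition behind placing aggregators with the topology in mind rather than by rank order alone.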

https://hal.inria.fr/hal-01394741
Contributor : Emmanuel Jeannot
Submitted on : Monday, November 14, 2016 - 11:40:16 AM
Last modification on : Tuesday, May 28, 2019 - 10:47:27 AM
Long-term archiving on : Wednesday, March 15, 2017 - 3:52:47 AM

File

topoIO-paper.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01394741, version 1

Citation

François Tessier, Preeti Malakar, Venkatram Vishwanath, Emmanuel Jeannot, Florin Isaila. Topology-Aware Data Aggregation for Intensive I/O on Large-Scale Supercomputers. COM-HPC 2016 - 1st Workshop on Optimization of Communication in HPC runtime systems, IEEE, Nov 2016, Salt Lake City, United States. pp.73-81. ⟨hal-01394741⟩
