Fast and Faithful Performance Prediction of MPI Applications: the HPL Case Study

Abstract: Finely tuning MPI applications (number of processes, granularity, collective operation algorithms, topology and process placement) is critical to obtaining good performance on supercomputers. With the rising cost of modern supercomputers, running parallel applications at scale solely to optimize their performance is extremely expensive. Inexpensive but faithful predictions of expected performance would therefore be a great help to researchers and system administrators. The methodology we propose captures the complexity of adaptive applications by emulating the MPI code while skipping insignificant parts. We demonstrate its capability with High Performance Linpack (HPL), the benchmark used to rank supercomputers in the TOP500, which requires careful tuning. We explain (1) how we both extended SimGrid's SMPI simulator and slightly modified the open-source version of HPL to allow fast emulation on a single commodity server at the scale of a supercomputer and (2) how to model the different components (network, BLAS, ...) of the system. We show that careful modeling of both spatial and temporal node variability allows us to obtain predictions within a few percent of real experiments (see Figure 1).

https://hal.inria.fr/hal-02096571
Contributor: Tom Cornebize
Submitted on: Thursday, September 26, 2019 - 12:48:02 AM
Last modification on: Tuesday, November 19, 2019 - 7:00:28 PM

Citation

Tom Cornebize, Arnaud Legrand, Franz Heinrich. Fast and Faithful Performance Prediction of MPI Applications: the HPL Case Study. 2019 IEEE International Conference on Cluster Computing (CLUSTER), Sep 2019, Albuquerque, United States. ⟨10.1109/CLUSTER.2019.8891011⟩. ⟨hal-02096571v3⟩
