Journal article, Journal of Parallel and Distributed Computing, Year: 2022

Simulation-based Optimization and Sensibility Analysis of MPI Applications: Variability Matters

Abstract

Finely tuning MPI applications and understanding the influence of key parameters (number of processes, granularity, collective operation algorithms, virtual topology, and process placement) is critical to obtaining good performance on supercomputers. Because running applications at scale consumes substantial resources, doing so solely to optimize their performance is particularly costly. Inexpensive but faithful predictions of expected performance could therefore be a great help for researchers and system administrators. The methodology we propose decouples the complexity of the platform, which is captured through statistical models of the performance of its main components (MPI communications, BLAS operations), from the complexity of adaptive applications, which is handled by emulating the application and skipping the regular non-MPI parts of the code. We demonstrate the capability of our method with High-Performance Linpack (HPL), the benchmark used to rank supercomputers in the TOP500, which requires careful tuning. We briefly present (1) how the open-source version of HPL can be slightly modified to allow fast emulation, on a single commodity server, of a run at the scale of a supercomputer. We then present (2) an extensive (in)validation study that compares simulation with real experiments and demonstrates our ability to consistently predict the performance of HPL within a few percent. This study allows us to identify the main modeling pitfalls (e.g., spatial and temporal node variability, or network heterogeneity and irregular behavior) that need to be considered. Last, we show (3) how our "surrogate" allows studying several subtle HPL parameter optimization problems while accounting for uncertainty on the platform.
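The idea of replacing a compute kernel by a draw from a statistical model can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the class `DgemmModel`, the helpers `make_platform` and `simulate_updates`, the linear-plus-Gaussian-noise form of the model, and all numeric coefficients are hypothetical. The sketch only shows the general principle of advancing a simulated clock instead of executing the BLAS call, with per-node coefficients standing in for spatial variability and random noise standing in for temporal variability.

```python
import random
from dataclasses import dataclass


@dataclass
class DgemmModel:
    """Hypothetical per-node model: duration = alpha * M*N*K + beta, plus noise."""
    alpha: float  # seconds per multiply-add (illustrative value)
    beta: float   # fixed per-call overhead in seconds (illustrative value)
    sigma: float  # relative standard deviation of the noise (temporal variability)

    def sample(self, m, n, k, rng):
        mean = self.alpha * m * n * k + self.beta
        # Clamp at zero so that a sampled duration is never negative.
        return max(0.0, rng.gauss(mean, self.sigma * mean))


def make_platform(num_nodes, rng):
    """Assumed spatial variability: each node gets a slightly different alpha."""
    return [
        DgemmModel(alpha=1e-11 * rng.uniform(0.95, 1.10), beta=1e-6, sigma=0.03)
        for _ in range(num_nodes)
    ]


def simulate_updates(platform, block_sizes, rng):
    """Advance one simulated clock per node instead of running dgemm.

    Returns the largest clock, i.e. the time at which the slowest node
    finishes when every node performs the same sequence of calls.
    """
    clocks = [0.0] * len(platform)
    for (m, n, k) in block_sizes:
        for node, model in enumerate(platform):
            clocks[node] += model.sample(m, n, k, rng)
    return max(clocks)


if __name__ == "__main__":
    rng = random.Random(42)
    platform = make_platform(num_nodes=4, rng=rng)
    calls = [(1024, 1024, 128)] * 10
    print(simulate_updates(platform, calls, rng))
```

Because the kernel is never executed, the cost of such a simulation depends on the number of modeled calls rather than on the matrix sizes, which is what makes emulating a large run on a single server plausible in principle. Repeating the simulation with different random seeds gives a rough picture of how platform variability spreads the predicted duration.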
Origin: Files produced by the author(s)

Dates and versions

hal-03141988, version 1 (15-02-2021)
hal-03141988, version 2 (06-01-2022)

Cite

Tom Cornebize, Arnaud Legrand. Simulation-based Optimization and Sensibility Analysis of MPI Applications: Variability Matters. Journal of Parallel and Distributed Computing, 2022, ⟨10.1016/j.jpdc.2022.04.002⟩. ⟨hal-03141988v2⟩