Conference papers

Interferences between Communications and Computations in Distributed HPC Systems

Alexandre Denis 1 Emmanuel Jeannot 1 Philippe Swartvagher 1
1 TADAAM - Topology-Aware System-Scale Data Management for High-Performance Computing
LaBRI - Laboratoire Bordelais de Recherche en Informatique, Inria Bordeaux - Sud-Ouest
Abstract : Parallel runtime systems such as MPI or task-based libraries provide models to manage both computation and communication by allocating cores, scheduling threads, and executing communication algorithms. Efficiently implementing such models is challenging because computation and communication interact within the runtime system. In this paper, we assess interferences between communications and computations when they run side by side. We study the impact of communications on computations and, conversely, the impact of computations on communication performance. We consider two aspects: CPU frequency and memory contention. We have designed benchmarks to measure these phenomena. We show that CPU frequency variations caused by computation have a small impact on communication latency and bandwidth. However, we have observed on Intel, AMD, and ARM processors that memory contention may cause a severe slowdown of both computation and communication when they occur at the same time. We have designed a benchmark with a tunable arithmetic intensity that shows how interferences between communication and computation depend on the memory pressure of the application. Finally, we have observed up to 90% performance loss on communications with common HPC kernels such as CG and GEMM.
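As an illustration of what a "tunable arithmetic intensity" can mean, here is a minimal sketch in Python. It is not the authors' benchmark: the names `tunable_kernel`, `arithmetic_intensity`, and `iters` are hypothetical, and the model assumes each element is loaded once and stored once as an 8-byte double. Raising `iters` adds floating-point work per byte moved, which shifts the kernel from memory-bound toward compute-bound.

```python
from array import array


def tunable_kernel(data, iters):
    """Load each element once, apply `iters` multiply-add steps, store it once.

    Hypothetical illustration only: memory traffic per element is fixed,
    while the amount of arithmetic per element grows with `iters`.
    """
    for i in range(len(data)):
        x = data[i]              # one load
        for _ in range(iters):
            x = x * 0.5 + 1.0    # one multiply and one add: 2 flops
        data[i] = x              # one store


def arithmetic_intensity(iters, bytes_per_element=8):
    """Flops per byte moved, assuming one load and one store per element."""
    flops = 2 * iters
    bytes_moved = 2 * bytes_per_element
    return flops / bytes_moved


if __name__ == "__main__":
    data = array("d", [0.0] * 1024)
    for iters in (1, 8, 64):
        tunable_kernel(data, iters)
        print(iters, arithmetic_intensity(iters))
```

A real measurement would use a compiled, multi-threaded kernel running alongside network transfers. This sketch only shows how a single parameter can control the ratio of computation to memory traffic.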

https://hal.inria.fr/hal-03290121
Contributor : Philippe Swartvagher
Submitted on : Monday, July 19, 2021 - 10:53:46 AM
Last modification on : Friday, May 6, 2022 - 3:42:46 AM
Long-term archiving on : Wednesday, October 20, 2021 - 6:20:31 PM

File

rr.pdf
Files produced by the author(s)

Citation

Alexandre Denis, Emmanuel Jeannot, Philippe Swartvagher. Interferences between Communications and Computations in Distributed HPC Systems. ICPP 2021 - 50th International Conference on Parallel Processing, Aug 2021, Chicago / Virtual, United States. pp.11, ⟨10.1145/3472456.3473516⟩. ⟨hal-03290121⟩
