On the Overhead of Topology Discovery for Locality-aware Scheduling in HPC

Brice Goglin 1
1 TADAAM - Topology-Aware System-Scale Data Management for High-Performance Computing
LaBRI - Laboratoire Bordelais de Recherche en Informatique, Inria Bordeaux - Sud-Ouest
Abstract : The increasing complexity of parallel computing platforms requires a deep knowledge of the hardware and of the application needs. Locality is a key criterion for performance optimization. It requires software tools that expose information about the hardware topology to high-performance runtime libraries. We show that the overhead of gathering such information from the operating system is significant on large computing nodes that run Linux. This overhead also increases more than linearly with the number of processes that perform the discovery simultaneously. We then study the actual needs of the HPC software ecosystem in terms of topology information. We propose some ways to avoid repeated expensive topology discoveries and to share topology information between components such as the resource manager or the runtime libraries.
Cited literature : 24 references

https://hal.inria.fr/hal-01402755
Contributor : Brice Goglin
Submitted on : Thursday, July 13, 2017 - 7:49:43 PM
Last modification on : Wednesday, May 15, 2019 - 5:24:04 PM
Long-term archiving on : Friday, January 26, 2018 - 7:18:40 PM

File

article.pdf
Files produced by the author(s)

Citation

Brice Goglin. On the Overhead of Topology Discovery for Locality-aware Scheduling in HPC. PDP2017 - 25th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, Mar 2017, St Petersburg, Russia. pp.9, ⟨10.1109/PDP.2017.35⟩. ⟨hal-01402755v3⟩
