
Communication-Aware Load Balancing of the LU Factorization over Heterogeneous Clusters

Abstract : Large clusters and supercomputers evolve rapidly, and regular hardware updates increase the chance that they become heterogeneous. Even nominally homogeneous clusters may show variable performance because of processor manufacturing differences, or may include partitions equipped with different types of accelerators. Distributing data over heterogeneous nodes is challenging but essential to exploit all resources efficiently. In this article, we build upon the flexibility of task-based runtimes to study the interplay between static communication-aware data distribution strategies and dynamic scheduling of the LU factorization over heterogeneous sets of hybrid nodes. We propose two techniques derived from state-of-the-art 1D×1D data distributions. The first uses fewer computing nodes towards the end of the execution, to better match performance bounds and save computing power. The second carefully moves a few blocks between nodes to further improve load balancing. We also demonstrate how 1D×1D data distributions, tailored for heterogeneous nodes, can scale better on homogeneous clusters than classical block-cyclic distributions. Validation is carried out in both real and simulated environments, on homogeneous and heterogeneous platforms, and demonstrates compelling performance improvements.

Cited literature : 27 references

https://hal.inria.fr/hal-02633985
Contributor : Arnaud Legrand
Submitted on : Wednesday, May 27, 2020 - 2:04:39 PM
Last modification on : Thursday, August 27, 2020 - 3:03:08 AM

File

pap222s1.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02633985, version 1

Citation

Lucas Nesi, Lucas Mello Schnorr, Arnaud Legrand. Communication-Aware Load Balancing of the LU Factorization over Heterogeneous Clusters. 2020. ⟨hal-02633985⟩

