Implementing a Systolic Algorithm for QR Factorization on Multicore Clusters with PaRSEC

Abstract: This article introduces a new systolic algorithm for QR factorization and its implementation on a supercomputing cluster of multicore nodes. The algorithm targets a virtual 3D array and requires only local communications. The implementation uses threads at the node level and MPI for inter-node communications. The complexity of the implementation is addressed with the PaRSEC software, which takes as input a parametrized dependence graph derived from the algorithm and only requires the user to decide, at a high level, the allocation of tasks to nodes. We show that the new algorithm performs competitively with state-of-the-art QR routines on the Kraken supercomputer. This result shows that high-level programming environments such as PaRSEC provide a viable alternative for enhancing the production of quality software on complex, hierarchical architectures.
Cited literature: 23 references

https://hal.inria.fr/hal-00879248
Contributor: Guillaume Pallez (aupy)
Submitted on: Saturday, November 2, 2013 - 11:20:28 AM
Last modification on: Thursday, November 21, 2019 - 2:02:31 AM
Long-term archiving on: Monday, February 3, 2014 - 4:24:22 AM

File

RR-8390.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00879248, version 1

Citation

Guillaume Aupy, Mathieu Faverge, Yves Robert, Jakub Kurzak, Piotr Luszczek, et al. Implementing a Systolic Algorithm for QR Factorization on Multicore Clusters with PaRSEC. [Research Report] RR-8390, INRIA. 2013, pp. 16. ⟨hal-00879248⟩
