Implementing a systolic algorithm for QR factorization on multicore clusters with PaRSEC

Abstract: This article introduces a new systolic algorithm for QR factorization and its implementation on a supercomputing cluster of multicore nodes. The algorithm targets a virtual 3D array and requires only local communications. The implementation uses threads at the node level and MPI for inter-node communications. The complexity of the implementation is addressed with the PaRSEC software, which takes as input a parametrized dependence graph derived from the algorithm and only requires the user to decide, at a high level, the allocation of tasks to nodes. We show that the new algorithm exhibits performance competitive with state-of-the-art QR routines on the Kraken supercomputer. This result shows that high-level programming environments such as PaRSEC provide a viable alternative for enhancing the production of quality software on complex and hierarchical architectures.
Document type:
Conference paper
PROPER 2013 - 6th Workshop on Productivity and Performance, Aug 2013, Aachen, Germany. 2013

Cited literature: 22 references

https://hal.inria.fr/hal-00844492
Contributor: Mathieu Faverge
Submitted on: Monday, December 2, 2013 - 10:02:26
Last modified on: Thursday, February 8, 2018 - 11:10:04
Document(s) archived on: Monday, March 3, 2014 - 13:55:48

File

submitted.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00844492, version 1

Citation

Guillaume Aupy, Mathieu Faverge, Yves Robert, Jakub Kurzak, Piotr Luszczek, et al. Implementing a systolic algorithm for QR factorization on multicore clusters with PaRSEC. PROPER 2013 - 6th Workshop on Productivity and Performance, Aug 2013, Aachen, Germany. 2013. 〈hal-00844492〉

Metrics

Record views: 519

File downloads: 115