Resource aggregation for task-based Cholesky Factorization on top of modern architectures

Terry Cojean 1, 2 Abdou Guermouche 3, 2, 4 Andra Hugo 5 Raymond Namyst 1, 2, 4 Pierre-André Wacrenier 1, 2, 4
1 STORM - STatic Optimizations, Runtime Methods
LaBRI - Laboratoire Bordelais de Recherche en Informatique, Inria Bordeaux - Sud-Ouest
3 HiePACS - High-End Parallel Algorithms for Challenging Numerical Simulations
LaBRI - Laboratoire Bordelais de Recherche en Informatique, Inria Bordeaux - Sud-Ouest
Abstract: Hybrid computing platforms are now commonplace, featuring a large number of CPU cores and accelerators. This trend makes balancing computations between these heterogeneous resources performance-critical. In this paper, we propose aggregating several CPU cores in order to execute larger parallel tasks and improve load balancing between CPUs and accelerators. Additionally, we present our approach to exploiting internal parallelism within tasks by combining two runtime system schedulers: a global runtime system to schedule the main task graph and a local one to cope with internal task parallelism. We demonstrate the relevance of our approach in the context of the dense Cholesky factorization kernel implemented on top of the StarPU task-based runtime system. We present experimental results showing that our solution outperforms state-of-the-art implementations on two architectures: a modern heterogeneous machine and the Intel Xeon Phi Knights Landing.
Document type:
Preprint, Working paper
This paper is submitted for review to the Parallel Computing special issue for HCW and HeteroPar .. 2016

Cited literature [24 references]

https://hal.inria.fr/hal-01409965
Contributor: Terry Cojean
Submitted on: Tuesday, 6 December 2016 - 11:54:56
Last modified on: Tuesday, 17 April 2018 - 09:08:38
Document(s) archived on: Tuesday, 21 March 2017 - 10:40:59

File

submission.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01409965, version 1

Citation

Terry Cojean, Abdou Guermouche, Andra Hugo, Raymond Namyst, Pierre-André Wacrenier. Resource aggregation for task-based Cholesky Factorization on top of modern architectures. This paper is submitted for review to the Parallel Computing special issue for HCW and HeteroPar .. 2016. 〈hal-01409965〉
