inria-00547616, version 1
Dynamically scheduled Cholesky factorization on multicore architectures with GPU accelerators.
Emmanuel Agullo (1, 2), Cédric Augonnet (2, 3), Jack Dongarra (4, 5, 6), Hatem Ltaief (4), Raymond Namyst (2, 3), Jean Roman (1, 2), Samuel Thibault (2, 3), Stanimire Tomov (4)
Symposium on Application Accelerators in High Performance Computing (SAAHPC) (2010)
Abstract: Although hardware has changed dramatically in the last few years, nodes of multicore chips augmented by Graphics Processing Units (GPUs) appear to be a trend of major importance. Previous approaches for scheduling dense linear algebra operations on such complex nodes achieved high performance, but at the double cost of not exploiting the potential of all the cores and of producing static, non-generic code. In this extended abstract, we present a new approach for scheduling dense linear algebra operations on multicore architectures with GPU accelerators, using a dynamic scheduler capable of exploiting the full potential of the node [1]. We underline the benefits in terms of both programmability and performance. We illustrate our approach with a Cholesky factorization relying on cutting-edge GPU and CPU kernels [2], [3], achieving roughly 900 Gflop/s on an eight-core node accelerated with three NVIDIA Tesla GPUs.
- 1 : HiePACS (INRIA Bordeaux - Sud-Ouest)
- INRIA – Université de Bordeaux – CNRS : UMR5800 – CERFACS
- 2 : Laboratoire Bordelais de Recherche en Informatique (LaBRI)
- CNRS : UMR5800 – Université Sciences et Technologies - Bordeaux I – École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB) – Université Victor Segalen - Bordeaux II
- 3 : RUNTIME (INRIA Bordeaux - Sud-Ouest)
- INRIA – CNRS : UMR5800 – Université Sciences et Technologies - Bordeaux I – École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)
- 4 : Department of Computer Science. University of Tennessee
- Tennessee State University
- 5 : Oak Ridge National Laboratory (ORNL)
- US Department of Energy – UT-Battelle
- 6 : School of Computer Science [Manchester]
- University of Manchester
- Domain: Computer Science / Parallel, distributed, and shared computing
- http://hal.inria.fr/inria-00547616
- oai:hal.inria.fr:inria-00547616
- Contributor: Samuel Thibault
- Submitted on: Thursday, 16 December 2010, 19:09:19
- Last modified on: Wednesday, 22 December 2010, 23:11:58





