Dense Symmetric Indefinite Factorization on GPU Accelerated Architectures

Abstract: We study the performance of dense symmetric indefinite factorizations (the Bunch-Kaufman and Aasen algorithms) on multicore CPUs with a Graphics Processing Unit (GPU). Though such algorithms are needed in many scientific and engineering simulations, obtaining high performance for the factorization on the GPU is difficult because the pivoting required to ensure numerical stability leads to frequent synchronizations and irregular data accesses. As a result, until recently, there has been no implementation of these algorithms on hybrid CPU/GPU architectures. To improve their performance on the hybrid architecture, we explore different techniques to reduce the expensive communication and synchronization between the CPU and the GPU, or within the GPU. We also study the performance of a symmetric indefinite factorization with no pivoting, combined with a preprocessing technique based on Random Butterfly Transformations. Though such transformations provide only probabilistic guarantees of numerical stability, they avoid pivoting and achieve high performance on the GPU.
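To illustrate why pivoting matters for symmetric indefinite matrices, here is a minimal sketch of an unpivoted factorization A = L D L^T in plain Python. It is not the implementation studied in the paper (no Bunch-Kaufman or Aasen pivoting, no Random Butterfly Transformations, no GPU); the function names `ldlt_no_pivot` and `reconstruct`, the tolerance `tol`, and the example matrices are all assumptions made for illustration.

```python
def ldlt_no_pivot(a, tol=1e-12):
    """Factor a symmetric matrix a (list of lists) as L * D * L^T without pivoting.

    Returns (L, d), where L is unit lower triangular and d holds the diagonal of D.
    Raises ValueError when a pivot is (nearly) zero, which is exactly the case
    that a pivoting strategy is designed to handle.
    """
    n = len(a)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        # Diagonal entry of D for column j.
        d[j] = a[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        if abs(d[j]) < tol:
            raise ValueError(f"pivot {j} is (nearly) zero; pivoting would be needed")
        # Entries of L below the diagonal in column j.
        for i in range(j + 1, n):
            s = a[i][j] - sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = s / d[j]
    return L, d


def reconstruct(L, d):
    """Multiply L * diag(d) * L^T back together to check the factorization."""
    n = len(d)
    return [[sum(L[i][k] * d[k] * L[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical symmetric indefinite example: the pivots have mixed signs.
A = [[4.0, 2.0, 2.0],
     [2.0, -3.0, 1.0],
     [2.0, 1.0, 3.0]]
L, d = ldlt_no_pivot(A)
print("diagonal of D:", d)

# A nonsingular symmetric matrix whose first pivot is zero: the unpivoted
# factorization breaks down even though the matrix is well conditioned.
try:
    ldlt_no_pivot([[0.0, 1.0], [1.0, 0.0]])
except ValueError as err:
    print("breakdown:", err)
```

The second matrix is nonsingular, yet the unpivoted algorithm cannot start because its leading entry is zero. Pivoting removes this failure mode at the cost of the data-dependent row and column exchanges that the abstract identifies as the obstacle on the GPU.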
Document type:
Conference paper
11th International Conference on Parallel Processing and Applied Mathematics (PPAM 2015), Sep 2015, Krakow, Poland. 2015, Lecture Notes in Computer Science. 〈http://ppam.pl〉

https://hal.inria.fr/hal-01223022
Contributor: Marc Baboulin
Submitted on: Sunday, November 1, 2015 - 15:15:24
Last modified on: Thursday, January 11, 2018 - 06:25:42

Identifiers

  • HAL Id: hal-01223022, version 1

Citation

Marc Baboulin, Jack Dongarra, Adrien Rémy, Stanimire Tomov, Ichitaro Yamazaki. Dense Symmetric Indefinite Factorization on GPU Accelerated Architectures. 11th International Conference on Parallel Processing and Applied Mathematics (PPAM 2015), Sep 2015, Krakow, Poland. 2015, Lecture Notes in Computer Science. 〈http://ppam.pl〉. 〈hal-01223022〉