Conference papers

Performance Analysis and Optimization of the Tiled Cholesky Factorization on NUMA Machines

Emmanuel Jeannot 1, 2 
1 RUNTIME - Efficient runtime systems for parallel architectures
Inria Bordeaux - Sud-Ouest, UB - Université de Bordeaux, CNRS - Centre National de la Recherche Scientifique : UMR5800
Abstract : We discuss some performance issues of the tiled Cholesky factorization on non-uniform memory access-time (NUMA) shared-memory machines. We show how to optimize thread placement and data placement in order to achieve performance gains of up to 50% compared to state-of-the-art libraries such as Plasma or MKL.

Cited literature [10 references]

https://hal.inria.fr/hal-00772790
Contributor : Emmanuel Jeannot
Submitted on : Friday, January 11, 2013 - 10:37:21 AM
Last modification on : Saturday, June 25, 2022 - 7:46:20 PM
Long-term archiving on : Saturday, April 1, 2017 - 3:46:31 AM

File

jeannot.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00772790, version 1

Citation

Emmanuel Jeannot. Performance Analysis and Optimization of the Tiled Cholesky Factorization on NUMA Machines. PAAP 2012 - IEEE International Symposium on Parallel Architectures, Algorithms and Programming, Dec 2012, Taipei, Taiwan. ⟨hal-00772790⟩
