A 3D Parallel Algorithm for QR Decomposition

Abstract : Interprocessor communication often dominates the runtime of large matrix computations. We present a parallel algorithm for computing QR decompositions whose bandwidth cost (communication volume) can be decreased at the cost of increasing its latency cost (number of messages). By varying a parameter to navigate the bandwidth/latency tradeoff, we can tune this algorithm for machines with different communication costs.

Contributor : Laura Grigori
Submitted on : Wednesday, January 2, 2019 - 4:20:45 PM
Last modification on : Wednesday, August 7, 2019 - 12:19:22 PM


  • HAL Id : hal-01968376, version 1


Grey Ballard, James Demmel, Laura Grigori, Mathias Jacquelin, Nicholas Knight. A 3D Parallel Algorithm for QR Decomposition. SPAA '18 - 30th ACM Symposium on Parallelism in Algorithms and Architectures, Jul 2018, Vienna, Austria. ⟨hal-01968376⟩


