Skip to Main content Skip to Navigation
Journal articles

1.5D Parallel Sparse Matrix-Vector Multiply

Abstract : There are three common parallel sparse matrix-vector multiply algorithms: 1D 3 row-parallel, 1D column-parallel and 2D row-column-parallel. The 1D parallel algorithms offer the 4 advantage of having only one communication phase. On the other hand, the 2D parallel algorithm 5 is more scalable but it suffers from two communication phases. Here, we introduce a novel concept 6 of heterogeneous messages where a heterogeneous message may contain both input-vector entries 7 and partially computed output-vector entries. This concept not only leads to a decreased number of 8 messages, but also enables fusing the input-and output-communication phases into a single phase. 9 These findings are exploited to propose a 1.5D parallel sparse matrix-vector multiply algorithm 10 which is called local row-column-parallel. This proposed algorithm requires a constrained fine-grain 11 partitioning in which each fine-grain task is assigned to the processor that contains either its input-12 vector entry, or its output-vector entry, or both. We propose two methods to carry out the constrained 13 fine-grain partitioning. We conduct our experiments on a large set of test matrices to evaluate the 14 partitioning qualities and partitioning times of these proposed 1.5D methods. 15 Key words. sparse matrix partitioning, parallel sparse matrix-vector multiplication, directed 16 hypergraph model, bipartite vertex cover, combinatorial scientific computing 17
Complete list of metadatas

Cited literature [22 references]  Display  Hide  Download
Contributor : Equipe Roma <>
Submitted on : Wednesday, October 17, 2018 - 12:38:57 PM
Last modification on : Thursday, August 20, 2020 - 4:40:07 PM
Long-term archiving on: : Friday, January 18, 2019 - 2:12:55 PM


Files produced by the author(s)




Enver Kayaaslan, Cevdet Aykanat, Bora Uçar. 1.5D Parallel Sparse Matrix-Vector Multiply. SIAM Journal on Scientific Computing, Society for Industrial and Applied Mathematics, 2018, 40 (1), pp.C25 - C46. ⟨10.1137/16M1105591⟩. ⟨hal-01897555⟩



Record views


Files downloads