Journal article, Future Generation Computer Systems, Year: 2022

Trading Performance for Memory in Sparse Direct Solvers using Low-rank Compression

Abstract

Sparse direct solvers using Block Low-Rank compression have proven efficient for solving problems arising in many real-life applications. Improving those solvers is crucial to solve larger problems and to speed up computations. A key characteristic of a sparse direct solver using low-rank compression is the point in the algorithm at which the compression is performed. There are two distinct approaches: (1) all blocks are compressed before the factorization starts, which reduces memory as much as possible, or (2) each block is compressed as late as possible, which usually leads to a better speedup. Approach 1 reaches a very small memory footprint, generally at the expense of a greater execution time. Approach 2 achieves a smaller execution time but requires more memory. The objective of this paper is to design a composite approach that speeds up computations while staying under a given memory limit. This should make it possible to solve large problems that cannot be solved with Approach 2, while reducing the execution time compared to Approach 1. We propose a memory-aware strategy where each block can be compressed either at the beginning or as late as possible. We first consider the problem of choosing when to compress each block, under the assumption that all information on blocks is perfectly known, i.e., the memory requirement and execution time of each block when compressed early or late. We show that this problem is a variant of the NP-complete Knapsack problem, and adapt an existing approximation algorithm to our problem. Unfortunately, the required information on blocks depends on numerical properties and in practice cannot be known in advance. We thus introduce models to estimate those values. Experiments on the PaStiX solver demonstrate that our new approach can achieve an excellent trade-off between memory consumption and computational cost. For instance, on matrix Geo1438, Approach 2 uses three times as much memory as Approach 1 while being three times faster. Our new approach leads to an execution time only 30% larger than that of Approach 2 when given a memory limit 30% larger than the memory needed by Approach 1.
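
To make the Knapsack connection mentioned in the abstract concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a simplified model in which the memory and time of each block are known exactly and simply add up, and it uses the classic greedy "best time saved per unit of extra memory" heuristic for Knapsack rather than the approximation algorithm the authors adapt. The names `Block` and `choose_late_blocks`, and all numbers, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical per-block costs under the two compression choices."""
    name: str
    mem_early: float   # memory if compressed before the factorization
    time_early: float  # execution time if compressed before the factorization
    mem_late: float    # memory if compressed as late as possible
    time_late: float   # execution time if compressed as late as possible


def choose_late_blocks(blocks, memory_limit):
    """Greedy Knapsack heuristic: return the names of blocks to compress late.

    Assumption: total memory is the sum of per-block memory, which is a
    simplification of how a real solver's footprint behaves.
    """
    base_mem = sum(b.mem_early for b in blocks)
    if base_mem > memory_limit:
        raise ValueError("memory limit is below the all-early footprint")
    budget = memory_limit - base_mem  # extra memory we are allowed to spend

    late = set()
    candidates = []
    for b in blocks:
        extra = b.mem_late - b.mem_early   # knapsack "weight"
        gain = b.time_early - b.time_late  # knapsack "value"
        if gain <= 0:
            continue  # late compression does not help this block
        if extra <= 0:
            late.add(b.name)  # faster at no memory cost: always take it
            budget -= extra
        else:
            candidates.append((gain / extra, extra, b.name))

    # Take blocks in decreasing order of time saved per unit of extra memory.
    for _, extra, name in sorted(candidates, reverse=True):
        if extra <= budget:
            late.add(name)
            budget -= extra
    return late


if __name__ == "__main__":
    demo = [
        Block("A", mem_early=1, time_early=10, mem_late=4, time_late=4),
        Block("B", mem_early=1, time_early=8, mem_late=2, time_late=3),
        Block("C", mem_early=1, time_early=5, mem_late=6, time_late=4),
    ]
    print(sorted(choose_late_blocks(demo, memory_limit=7)))  # prints ['A', 'B']
```

Starting from the all-early configuration (Approach 1) and spending the remaining memory on the most profitable blocks mirrors the trade-off described above: with a limit equal to the all-early footprint the sketch returns no late blocks, and with an unbounded limit it returns every block for which late compression is faster (Approach 2 in this toy model).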
Main file
paper.pdf (1.1 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03517124, version 1 (07-01-2022)

Cite

Loris Marchal, Thibault Marette, Grégoire Pichon, Frédéric Vivien. Trading Performance for Memory in Sparse Direct Solvers using Low-rank Compression. Future Generation Computer Systems, 2022, 130, pp.307-320. ⟨10.1016/j.future.2021.12.018⟩. ⟨hal-03517124⟩