Reports (Research report)

Task-based Conjugate Gradient: from multi-GPU towards heterogeneous architectures

Abstract : Whereas most parallel High Performance Computing (HPC) numerical libraries have been written as highly tuned and mostly monolithic codes, the increased complexity of modern architectures has led the computational science and engineering community to consider more modular programming paradigms, such as task-based paradigms, to design a new generation of parallel simulation codes; this makes it possible to delegate part of the work to third-party software such as a runtime system. This approach has been shown to be very productive and efficient with compute-intensive algorithms, such as dense linear algebra and sparse direct solvers. In this study, we consider a much more irregular and synchronizing algorithm, namely the Conjugate Gradient (CG) algorithm. We propose a task-based formulation of the algorithm together with a very fine instrumentation of the runtime system. We show that almost optimal speedup may be reached on a multi-GPU platform (relative to the single-GPU case) and, as a very preliminary but promising result, that the approach can be effectively used to handle heterogeneous architectures composed of a multicore chip and multiple GPUs. We expect that these results will pave the way for investigating the design of new advanced, irregular numerical algorithms on top of runtime systems.
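
To make concrete why CG is described above as a synchronizing algorithm, the following is a minimal sequential sketch of the textbook Conjugate Gradient iteration in plain Python. It is not the task-based formulation proposed in the report; the function names (`dot`, `matvec`, `conjugate_gradient`) and the dense list-of-lists matrix are illustrative assumptions. The comments mark the two dot products per iteration, whose scalar results every later update depends on.

```python
def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def matvec(A, x):
    """Dense matrix-vector product; A is a list of rows (illustrative storage)."""
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Textbook CG for a symmetric positive definite matrix A (assumed)."""
    x = [0.0] * len(b)
    r = list(b)            # residual r = b - A x, with x = 0 initially
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        # Synchronization point 1: alpha needs the full reduction p . Ap
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        # Synchronization point 2: beta needs the full reduction r . r
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x
```

For example, `conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])` approximates the solution of that 2x2 system, whose exact value is (1/11, 7/11). In a parallel setting, each of the two marked reductions forces all partial results to be combined before the next vector update can start, which is the kind of dependency a task-based runtime has to schedule around.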
Document type :
Reports (Research report)

Cited literature [13 references]
Contributor : Luc Giraud
Submitted on : Thursday, December 15, 2016 - 11:33:58 AM
Last modification on : Wednesday, October 26, 2022 - 8:14:48 AM
Long-term archiving on : Thursday, March 16, 2017 - 5:21:26 PM


Files produced by the author(s)


  • HAL Id : hal-01316982, version 2



E. Agullo, L. Giraud, A. Guermouche, S. Nakov, J. Roman. Task-based Conjugate Gradient: from multi-GPU towards heterogeneous architectures. [Research Report] RR-8912, Inria. 2016. ⟨hal-01316982v2⟩


