Task-based Conjugate Gradient: from multi-GPU towards heterogeneous architectures - HAL open archive
Reports (Research Report) Year : 2016

Task-based Conjugate Gradient: from multi-GPU towards heterogeneous architectures

Abstract

Whereas most parallel High Performance Computing (HPC) numerical libraries have been written as highly tuned and mostly monolithic codes, the increased complexity of modern architectures has led the computational science and engineering community to consider more modular programming paradigms, such as task-based paradigms, to design a new generation of parallel simulation codes; this makes it possible to delegate part of the work to third-party software such as a runtime system. This approach has been shown to be very productive and efficient with compute-intensive algorithms, such as dense linear algebra and sparse direct solvers. In this study, we consider a much more irregular and synchronizing algorithm, namely the Conjugate Gradient (CG) algorithm. We propose a task-based formulation of the algorithm together with a very fine instrumentation of the runtime system. We show that almost optimum speedup may be reached on a multi-GPU platform (relative to the mono-GPU case) and, as a very preliminary but promising result, that the approach can be effectively used to handle heterogeneous architectures composed of a multicore chip and multiple GPUs. We expect that these results will pave the way for investigating the design of new advanced, irregular numerical algorithms on top of runtime systems.
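As a rough illustration of why CG is described as a synchronizing algorithm, the following minimal sketch runs the textbook Conjugate Gradient iteration with every kernel applied block by block, the way a task-based formulation might split the work. It is not the report's implementation: the block partitioning, the helper names (`blocks`, `spmv`, `dot`, `axpy`, `conjugate_gradient`) and the small dense test matrix are assumptions made for illustration, and the "tasks" run sequentially here rather than being submitted to a runtime system.

```python
# Illustrative sketch only: textbook Conjugate Gradient with each kernel
# applied block by block. Block layout and helper names are assumptions,
# not the report's code; blocks run sequentially, with no runtime system.

def blocks(n, nb):
    """Split range(n) into at most nb contiguous index ranges (one per 'task')."""
    step = (n + nb - 1) // nb
    return [range(s, min(s + step, n)) for s in range(0, n, step)]

def spmv(A, x, parts):
    """y = A x, computed one row block at a time (blocks are independent)."""
    y = [0.0] * len(x)
    for rows in parts:
        for i in rows:
            y[i] = sum(A[i][j] * x[j] for j in range(len(x)))
    return y

def dot(u, v, parts):
    """Partial dot products per block, then a reduction.
    The reduction is a synchronization point: every block must finish first."""
    partial = [sum(u[i] * v[i] for i in rows) for rows in parts]
    return sum(partial)

def axpy(alpha, x, y, parts):
    """y <- y + alpha * x, updated in place one block at a time."""
    for rows in parts:
        for i in rows:
            y[i] += alpha * x[i]

def conjugate_gradient(A, b, nb=2, tol=1e-10, max_iter=100):
    """Solve A x = b for a symmetric positive definite matrix A."""
    n = len(b)
    parts = blocks(n, nb)
    x = [0.0] * n
    r = list(b)                      # r = b - A x, with x = 0 initially
    p = list(r)
    rs_old = dot(r, r, parts)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:      # stop once the residual norm is small
            break
        Ap = spmv(A, p, parts)
        alpha = rs_old / dot(p, Ap, parts)   # scalar needed by all later updates
        axpy(alpha, p, x, parts)             # x <- x + alpha * p
        axpy(-alpha, Ap, r, parts)           # r <- r - alpha * A p
        rs_new = dot(r, r, parts)            # second reduction of the iteration
        beta = rs_new / rs_old
        for rows in parts:                   # p <- r + beta * p
            for i in rows:
                p[i] = r[i] + beta * p[i]
        rs_old = rs_new
    return x

# Small symmetric positive definite system used only as an example.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x = conjugate_gradient(A, b)
```

In this sketch the two `dot` calls per iteration each produce a scalar that all subsequent block updates depend on, which is one way to see why the algorithm offers less independent work per iteration than dense linear algebra or sparse direct solvers.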
Main file
RR-8912.pdf (2.07 MB)
Origin : Files produced by the author(s)

Dates and versions

hal-01316982 , version 1 (17-05-2016)
hal-01316982 , version 2 (15-12-2016)

Identifiers

  • HAL Id : hal-01316982 , version 2

Cite

E Agullo, L Giraud, A Guermouche, S Nakov, Jean Roman. Task-based Conjugate Gradient: from multi-GPU towards heterogeneous architectures. [Research Report] RR-8912, Inria. 2016. ⟨hal-01316982v2⟩