Task-based Conjugate Gradient: from multi-GPU towards heterogeneous architectures

Abstract : Whereas most parallel High Performance Computing (HPC) numerical libraries have been written as highly tuned and mostly monolithic codes, the increased complexity of modern architectures has led the computational science and engineering community to consider more modular programming paradigms, such as task-based paradigms, for designing a new generation of parallel simulation codes; this makes it possible to delegate part of the work to third-party software such as a runtime system. This approach has been shown to be very productive and efficient for compute-intensive algorithms, such as dense linear algebra and sparse direct solvers. In this study, we consider a much more irregular and synchronizing algorithm, namely the Conjugate Gradient (CG) algorithm. We propose a task-based formulation of the algorithm together with a very fine instrumentation of the runtime system. We show that almost optimal speed-up may be reached on a multi-GPU platform (relative to the single-GPU case) and, as a very preliminary but promising result, that the approach can be effectively used to handle heterogeneous architectures composed of a multicore chip and multiple GPUs. We expect that these results will pave the way for investigating the design of new advanced, irregular numerical algorithms on top of runtime systems.
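For readers unfamiliar with why CG is described as a "synchronizing" algorithm, the following is a minimal sketch of the textbook sequential method, not the task-based formulation proposed in the paper. The helper names (`dot`, `matvec`, `conjugate_gradient`), the tolerance, and the small test system are assumptions chosen for illustration only. The comments mark the two dot products per iteration: each is a global reduction whose scalar result is needed before the following vector updates can start, which is what limits overlap in a parallel or task-based setting.

```python
# Textbook Conjugate Gradient for a symmetric positive definite system A x = b.
# Illustrative sketch only; all names and parameters are hypothetical.

def dot(u, v):
    """Inner product; in a parallel setting this is a global reduction."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Dense matrix-vector product (a real solver would use a sparse format)."""
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    x = [0.0] * len(b)
    r = list(b)              # residual r = b - A x, with initial guess x = 0
    p = list(r)              # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break            # residual norm small enough
        Ap = matvec(A, p)
        # Synchronization point 1: alpha depends on a completed reduction.
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        # Synchronization point 2: beta depends on a second reduction.
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Small assumed example system; its exact solution is (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = conjugate_gradient(A, b)
```

In a task-based setting, each of these operations would typically be split into tasks over blocks of the vectors and matrix, with the scalars `alpha` and `beta` acting as dependencies between them; how the paper actually partitions and schedules the work is not described in this abstract.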

https://hal.inria.fr/hal-01334734
Contributor : Luc Giraud
Submitted on : Tuesday, June 21, 2016 - 11:43:29 AM
Last modification on : Tuesday, October 29, 2019 - 7:36:06 AM

Citation

Emmanuel Agullo, Luc Giraud, Abdou Guermouche, Stojce Nakov, Jean Roman. Task-based Conjugate Gradient: from multi-GPU towards heterogeneous architectures. HeteroPar'2016 workshop of Euro-Par, Aug 2016, Grenoble, France. ⟨10.1007/978-3-319-58943-5⟩. ⟨hal-01334734⟩
