Task-based FMM for heterogeneous architectures

Abstract : A high-performance fast multipole method (FMM) is crucial for the numerical simulation of many physical problems. In a previous study [Agullo2013], we showed that a task-based FMM provides the flexibility required to process a wide spectrum of particle distributions efficiently on multicore architectures. In this paper, we show how such an approach can be extended to fully exploit heterogeneous platforms. To that end, we design highly tuned GPU versions of the two dominant operators (P2P and M2L), as well as a scheduling strategy that dynamically decides what proportion of subsequent tasks is processed on regular CPU cores and what proportion on GPU accelerators. We assess our method with the StarPU runtime system, which executes the resulting task flow on an Intel X5650 Nehalem multicore processor optionally enhanced with one, two, or three Nvidia Fermi M2070 or M2090 GPUs. A detailed experimental study on two 30-million-particle distributions (a cube and an ellipsoid) shows that the resulting software consistently achieves high performance across architectures.
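
The abstract does not spell out the scheduling heuristic, so the following is only a minimal sketch of the general idea of choosing a CPU/GPU proportion from observed throughput. It assumes each side's processing rate has been measured on earlier tasks and that each task has a known cost estimate. The names `gpu_share` and `split_tasks` are hypothetical; they are not part of StarPU and are not taken from the report.

```python
def gpu_share(cpu_rate, gpu_rate):
    """Fraction of the work to give the GPUs so both sides finish together.

    Rates are measured throughputs (work units per second) on earlier tasks.
    This proportional rule is an assumption, not the report's heuristic.
    """
    total = cpu_rate + gpu_rate
    if total <= 0:
        raise ValueError("at least one rate must be positive")
    return gpu_rate / total


def split_tasks(costs, share):
    """Greedily assign the most expensive tasks to the GPUs.

    costs: estimated cost of each task (e.g. interaction count of a P2P task).
    share: fraction of the total cost the GPUs should receive.
    Returns (gpu_indices, cpu_indices).
    """
    budget = share * sum(costs)
    # Visit tasks from most to least expensive.
    order = sorted(range(len(costs)), key=lambda i: costs[i], reverse=True)
    gpu, cpu, used = [], [], 0.0
    for i in order:
        if used + costs[i] <= budget:
            gpu.append(i)
            used += costs[i]
        else:
            cpu.append(i)
    return gpu, cpu


# Hypothetical measurements: the GPUs are three times faster than the CPU cores.
share = gpu_share(cpu_rate=1.0, gpu_rate=3.0)
gpu_tasks, cpu_tasks = split_tasks([4, 3, 2, 1], share)
```

Recomputing `share` after each batch of tasks is what would make such a split dynamic: as the measured rates change, so does the proportion of work sent to each kind of processing unit.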

https://hal.inria.fr/hal-00974674
Contributor : Olivier Coulaud
Submitted on : Monday, April 7, 2014 - 12:21:09 PM
Last modification on : Thursday, January 11, 2018 - 6:22:35 AM
Long-term archiving on : Monday, July 7, 2014 - 11:11:17 AM

File

RR-8513.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00974674, version 1

Citation

Emmanuel Agullo, Berenger Bramas, Olivier Coulaud, Eric Darve, Matthias Messner, et al. Task-based FMM for heterogeneous architectures. [Research Report] RR-8513, Inria. 2014, pp. 29. ⟨hal-00974674⟩
