Bridging the gap between OpenMP and task-based runtime systems for the fast multipole method

Emmanuel Agullo 1, Olivier Aumage 2, Berenger Bramas 3, Olivier Coulaud 1, Samuel Pitoiset 2
1 HiePACS - High-End Parallel Algorithms for Challenging Numerical Simulations
LaBRI - Laboratoire Bordelais de Recherche en Informatique, Inria Bordeaux - Sud-Ouest
2 STORM - STatic Optimizations, Runtime Methods
LaBRI - Laboratoire Bordelais de Recherche en Informatique, Inria Bordeaux - Sud-Ouest
Abstract: With the advent of complex modern architectures, the low-level paradigms long considered sufficient to build High Performance Computing (HPC) numerical codes have reached their limits. Achieving efficiency and ensuring portability while preserving programming tractability on such hardware has prompted the HPC community to design new, higher-level paradigms while relying on runtime systems to maintain performance. However, a common weakness of these projects is that they tie applications deeply to specific, expert-only runtime system APIs. The OpenMP specification, which aims to provide common parallel programming means for shared-memory platforms, appears to be a good candidate to address this issue thanks to the task-based constructs introduced in its revision 4.0. The goal of this paper is to assess the effectiveness and limits of this support for designing a high-performance numerical library, ScalFMM, which implements the fast multipole method (FMM) and which we have deeply redesigned around the most advanced features provided by OpenMP 4. We show that OpenMP 4 allows for significant performance improvements over previous OpenMP revisions on recent multicore processors, and that extensions to the 4.0 standard strongly improve performance further, bridging the gap with the very high performance that was so far reserved to expert-only runtime system APIs.
Document type:
Journal article
IEEE Transactions on Parallel and Distributed Systems, Institute of Electrical and Electronics Engineers, 2017, pp.14. 〈10.1109/TPDS.2017.2697857〉
Cited literature: 27 references

https://hal.inria.fr/hal-01517153
Contributor: Olivier Aumage
Submitted on: Tuesday, 2 May 2017 - 17:25:19
Last modified on: Monday, 18 September 2017 - 09:52:11
Document(s) archived on: Thursday, 3 August 2017 - 13:46:02

File

tpds_kstar_scalfmm_print.pdf
Files produced by the author(s)

Relations

  • also available in another format: hal-01372022 - Extended version as research report

Citation

Emmanuel Agullo, Olivier Aumage, Berenger Bramas, Olivier Coulaud, Samuel Pitoiset. Bridging the gap between OpenMP and task-based runtime systems for the fast multipole method. IEEE Transactions on Parallel and Distributed Systems, Institute of Electrical and Electronics Engineers, 2017, pp.14. 〈10.1109/TPDS.2017.2697857〉. 〈hal-01517153〉
