Scaling and optimizing the Gysela code on a cluster of many-core processors

Abstract: The current generation of the Xeon Phi Knights Landing (KNL) processor provides a highly multi-threaded environment on which standard programming models such as MPI/OpenMP can be used. This hardware offers both high memory bandwidth and large computing resources, and is currently available at computing facilities. Many factors affect the performance achieved by applications: one key point is the efficient exploitation of SIMD vector units, another is the memory access pattern. Vectorization and optimization work has therefore been conducted on a plasma turbulence application, Gysela. A set of techniques has been used: loop splitting, inlining, grouping a set of LU solve operations, removing conditionals and some loop nests, auto-tuning of one computation kernel, and changing a key numerical scheme (Lagrange interpolation instead of cubic splines). As a result, KNL execution times have been reduced by up to a factor of 3 in some configurations. This effort also yielded a speedup of 2x on the Broadwell architecture and 3x on Skylake. Good scalability up to a few thousand cores has been obtained in a strong scaling experiment. Incremental work on vectorizing the Gysela code brought a large payoff without resorting to writing assembly code or using low-level intrinsics.
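
The change of numerical scheme mentioned in the abstract can be illustrated with a small sketch. Cubic spline interpolation generally requires solving a linear system for the spline coefficients, whereas Lagrange interpolation computes each value from a short local stencil with explicitly known weights. The code below is not taken from Gysela: the 4-point stencil, the periodic boundary handling, and the names `lagrange_weights` and `interpolate_periodic` are assumptions chosen for illustration only.

```python
import math


def lagrange_weights(s, offsets):
    """Lagrange basis weights at normalized position s for nodes at the given offsets."""
    weights = []
    for j, xj in enumerate(offsets):
        w = 1.0
        for k, xk in enumerate(offsets):
            if k != j:
                w *= (s - xk) / (xj - xk)
        weights.append(w)
    return weights


def interpolate_periodic(values, x, dx, offsets=(-1, 0, 1, 2)):
    """Interpolate periodic samples values[i] = f(i * dx) at position x.

    Uses a local 4-point (cubic) Lagrange stencil by default (an assumption
    made for this sketch, not the order used in the paper).
    """
    n = len(values)
    i = math.floor(x / dx)   # index of the grid point at or left of x
    s = x / dx - i           # normalized offset within the cell, in [0, 1)
    weights = lagrange_weights(s, offsets)
    # Only len(offsets) neighbouring samples are read: no global solve is needed.
    return sum(w * values[(i + o) % n] for w, o in zip(weights, offsets))


if __name__ == "__main__":
    dx = 0.5
    samples = [(i * dx) ** 3 for i in range(16)]
    print(interpolate_periodic(samples, 3.3, dx))
```

Because every interpolated value depends only on a fixed number of neighbouring samples and the weights follow from a closed-form expression, a loop over many target points has no dependency between iterations, which is the kind of structure that compilers can usually vectorize.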
Document type:
Conference paper
SBAC-PAD 2018, WAMCA workshop, Sep 2018, Lyon, France. SBAC-PAD 2018 proceedings. 〈IEEE〉

https://hal.inria.fr/hal-01719208
Contributor: Guillaume Latu
Submitted on: Monday, October 1, 2018 - 10:52:26
Last modified on: Wednesday, October 3, 2018 - 01:18:33

File

wamca18_gl.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01719208, version 2

Citation

Guillaume Latu, Yuuichi Asahi, Julien Bigot, Tamás Fehér, Virginie Grandgirard. Scaling and optimizing the Gysela code on a cluster of many-core processors. SBAC-PAD 2018, WAMCA workshop, Sep 2018, Lyon, France. SBAC-PAD 2018 proceedings. 〈IEEE〉. 〈hal-01719208v2〉
