Generic algorithmic scheme for 2D stencil applications on hybrid machines

Abstract: Hardware accelerators are classic scientific coprocessors in HPC machines. However, the number of CPU cores on the motherboard is increasing and now constitutes a non-negligible part of the total computing power of the machine. Running an application on both an accelerator (such as a GPU or a Xeon-Phi device) and the CPU cores can therefore provide the highest performance. Moreover, it is now possible to include different accelerators in one machine, in order to support and speed up a larger set of applications. Running each part of an application on the most suitable device makes it possible to reach high performance, and using all otherwise idle devices in the machine should improve the performance of that part even further. However, overlapping computations with inter-device data transfers is mandatory to limit the overhead of this approach, which leads to complex asynchronous algorithms and multi-paradigm optimized codes. This article introduces our research and experiments on cooperation between several CPUs, a GPU, and a Xeon-Phi accelerator, all included in the same machine.
Document type:
Conference paper
ARCS 2016 - Architecture of Computing Systems, Apr 2016, Nuremberg, Germany. LNCS, 〈https://www3.cs.fau.de/arcs2016〉

https://hal.inria.fr/hal-01263242
Contributor: Sylvain Contassot-Vivier
Submitted on: Wednesday, January 27, 2016 - 15:38:25
Last modified on: Tuesday, April 24, 2018 - 13:38:11

Identifiers

  • HAL Id : hal-01263242, version 1

Citation

Stéphane Vialle, Sylvain Contassot-Vivier, Patrick Mercier. Generic algorithmic scheme for 2D stencil applications on hybrid machines. ARCS 2016 - Architecture of Computing Systems, Apr 2016, Nuremberg, Germany. LNCS, 〈https://www3.cs.fau.de/arcs2016〉. 〈hal-01263242〉

