Structuring the execution of OpenMP applications for multicore architectures

François Broquedis 1, 2 Olivier Aumage 1, 2 Brice Goglin 1, 2 Samuel Thibault 1, 2 Pierre-André Wacrenier 1, 2 Raymond Namyst 1, 2
2 RUNTIME - Efficient runtime systems for parallel architectures
Inria Bordeaux - Sud-Ouest, UB - Université de Bordeaux, CNRS - Centre National de la Recherche Scientifique : UMR5800
Abstract: The now commonplace multicore chips have introduced, by design, a deep hierarchy of memory and cache banks within parallel computers, as a tradeoff between the user-friendliness of shared memory on the one hand and memory access scalability and efficiency on the other. However, getting high performance out of such machines requires a dynamic mapping of application tasks and data onto the underlying architecture. Moreover, depending on the application's behavior, this mapping should favor cache affinity, memory bandwidth, computation synchrony, or a combination of these. The great challenge is then to perform this hardware-dependent mapping in a portable, abstract way. To meet this need, we propose a new, hierarchical approach to the execution of OpenMP threads on multicore machines. Our ForestGOMP runtime system dynamically generates structured trees out of OpenMP programs. It also collects relationship information about threads and data. The scheduler uses this information, together with scheduling hints and hardware counter feedback, to select the most appropriate thread and data distribution. ForestGOMP features a high-level platform for developing and tuning portable thread schedulers. We present several applications for which we developed specific scheduling policies that achieve excellent speedups on 16-core machines.
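The idea of keeping a tree of related threads together on the hardware hierarchy can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not ForestGOMP's algorithm or API: the `Team` class, the `place` function, the greedy least-loaded policy, and the 4-node, 4-cores-per-node machine layout are all hypothetical names and choices introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Team:
    """Threads created by one parallel region, plus any nested teams (hypothetical model)."""
    name: str
    threads: int = 0                             # threads running directly in this team
    nested: list = field(default_factory=list)   # teams created by nested parallel regions

    def size(self) -> int:
        """Total number of threads in this subtree."""
        return self.threads + sum(t.size() for t in self.nested)


def place(team, nodes):
    """Greedy sketch: put each top-level nested team on the least-loaded node.

    `nodes` is a list of core-id lists, one list per NUMA node (an assumed layout).
    Returns a dict mapping a thread label to a core id.
    """
    load = [0] * len(nodes)   # number of threads already assigned to each node
    placement = {}

    def assign(subtree, node):
        # Every thread of a subtree stays on the same node, so related
        # threads sit close to the same memory bank and caches.
        for i in range(subtree.threads):
            core = nodes[node][load[node] % len(nodes[node])]
            placement[f"{subtree.name}.{i}"] = core
            load[node] += 1
        for child in subtree.nested:
            assign(child, node)

    # Largest subtrees first, each going to the node with the fewest threads so far.
    for sub in sorted(team.nested, key=Team.size, reverse=True):
        assign(sub, load.index(min(load)))

    # Threads owned directly by the root team fill the least-loaded nodes.
    for i in range(team.threads):
        node = load.index(min(load))
        placement[f"{team.name}.{i}"] = nodes[node][load[node] % len(nodes[node])]
        load[node] += 1
    return placement


# Assumed 16-core machine: 4 NUMA nodes with 4 cores each.
machine = [list(range(n * 4, n * 4 + 4)) for n in range(4)]
# An outer region that spawns 4 nested teams of 4 threads each.
app = Team("outer", nested=[Team(f"inner{k}", threads=4) for k in range(4)])
mapping = place(app, machine)
```

With this input, each nested team lands on its own node and every core receives exactly one thread. A real scheduler of the kind the abstract describes would also weigh data placement, scheduling hints, and hardware counter feedback, which this sketch leaves out.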
Document type: Conference paper
IEEE. International Parallel and Distributed Symposium (IPDPS 2010), Apr 2010, Atlanta, United States. 2010, <10.1109/IPDPS.2010.5470442>

https://hal.inria.fr/inria-00441472
Contributor: Brice Goglin
Submitted on: Friday, January 29, 2010 - 15:29:08
Last modified on: Thursday, September 10, 2015 - 01:06:37
Document(s) archived on: Wednesday, November 30, 2016 - 11:51:28

File

PID1125911.pdf
Files produced by the author(s)

Citation

François Broquedis, Olivier Aumage, Brice Goglin, Samuel Thibault, Pierre-André Wacrenier, et al. Structuring the execution of OpenMP applications for multicore architectures. IEEE. International Parallel and Distributed Symposium (IPDPS 2010), Apr 2010, Atlanta, United States. 2010, <10.1109/IPDPS.2010.5470442>. <inria-00441472>

Metrics

Record views: 343

Document downloads: 290