Process Placement in Multicore Clusters: Algorithmic Issues and Practical Techniques.

Emmanuel Jeannot (1, 2), Guillaume Mercier (1, 2), François Tessier (2, 1)
(2) RUNTIME - Efficient runtime systems for parallel architectures
Inria Bordeaux - Sud-Ouest, CNRS - Centre National de la Recherche Scientifique : UMR5800, UB - Université de Bordeaux
Abstract: Current generations of NUMA node clusters feature multicore or manycore processors. Programming such architectures efficiently is a challenge because numerous hardware characteristics have to be taken into account, especially the memory hierarchy. One appealing idea to improve the performance of parallel applications is to decrease their communication costs by matching the communication pattern to the underlying hardware architecture. In this paper, we detail the algorithm and techniques proposed to achieve such a result: first, we gather both the communication pattern information and the hardware details. Then we compute a relevant reordering of the various process ranks of the application. Finally, those new ranks are used to reduce the communication costs of the application.
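
The three steps named in the abstract (gather the communication pattern and hardware details, compute a rank reordering, apply the new ranks) can be illustrated with a minimal sketch. The code below is not the algorithm from the paper: it swaps in a simple greedy grouping heuristic, and it assumes a flat two-level machine (identical nodes with `cores_per_node` cores each) and a symmetric communication matrix `comm` that has already been measured. The names `greedy_placement`, `inter_node_volume`, `comm` and `cores_per_node` are all hypothetical and chosen only for this illustration.

```python
from itertools import combinations


def greedy_placement(comm, cores_per_node):
    """Return new_rank, where new_rank[old] is the reordered rank of process `old`.

    comm[i][j] is the (assumed symmetric) volume exchanged between ranks i and j.
    Consecutive new ranks are assumed to be bound to the same node, so ranks
    new_rank // cores_per_node == k all live on node k.
    """
    n = len(comm)
    unplaced = set(range(n))
    order = []  # old ranks, listed node by node in the order they fill cores

    while unplaced:
        # Seed a node with the rank that talks most to the remaining ranks.
        seed = max(sorted(unplaced),
                   key=lambda r: sum(comm[r][q] for q in unplaced))
        group = [seed]
        unplaced.remove(seed)

        # Fill the node with the ranks that talk most to those already on it.
        while len(group) < cores_per_node and unplaced:
            best = max(sorted(unplaced),
                       key=lambda r: sum(comm[r][g] for g in group))
            group.append(best)
            unplaced.remove(best)

        order.extend(group)

    new_rank = [0] * n
    for position, old in enumerate(order):
        new_rank[old] = position
    return new_rank


def inter_node_volume(comm, new_rank, cores_per_node):
    """Total volume exchanged between ranks placed on different nodes."""
    n = len(comm)
    return sum(
        comm[i][j]
        for i, j in combinations(range(n), 2)
        if new_rank[i] // cores_per_node != new_rank[j] // cores_per_node
    )


# Toy pattern: ranks 0 and 2 exchange a lot, as do ranks 1 and 3.
comm = [
    [0, 1, 100, 1],
    [1, 0, 1, 100],
    [100, 1, 0, 1],
    [1, 100, 1, 0],
]
identity = [0, 1, 2, 3]
reordered = greedy_placement(comm, cores_per_node=2)
print(reordered)                              # [0, 2, 1, 3]
print(inter_node_volume(comm, identity, 2))   # 202
print(inter_node_volume(comm, reordered, 2))  # 4
```

In this toy case the reordering moves the two heavy pairs onto the same node, so the volume crossing the network drops from 202 to 4. In an MPI application, one possible way to apply such a permutation is to create a new communicator with `MPI_Comm_split`, passing the new rank as the `key` argument; whether this matches the technique used in the paper is not stated in the abstract.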
Document type:
Journal article
IEEE Transactions on Parallel and Distributed Systems, Institute of Electrical and Electronics Engineers, 2014, 25 (4), pp. 993-1002. <10.1109/TPDS.2013.104>


https://hal.inria.fr/hal-01109978
Contributor: Emmanuel Jeannot
Submitted on: Tuesday, January 27, 2015 - 11:46:30
Last modified on: Wednesday, September 9, 2015 - 16:36:19

Citation

Emmanuel Jeannot, Guillaume Mercier, François Tessier. Process Placement in Multicore Clusters: Algorithmic Issues and Practical Techniques. IEEE Transactions on Parallel and Distributed Systems, Institute of Electrical and Electronics Engineers, 2014, 25 (4), pp. 993-1002. <10.1109/TPDS.2013.104>. <hal-01109978>
