Analysis of MPI Shared-Memory Communication Performance from a Cache Coherence Perspective

Bertrand Putigny (1, 2), Benoit Ruelle (1), Brice Goglin (1, 2)
1 RUNTIME - Efficient runtime systems for parallel architectures
Inria Bordeaux - Sud-Ouest, UB - Université de Bordeaux, CNRS - Centre National de la Recherche Scientifique : UMR5800
Abstract: Shared-memory MPI communication is an important part of the overall performance of parallel applications. However, understanding the behavior of these data transfers is difficult because of the combined complexity of modern memory architectures (multiple levels of caches and complex cache coherence protocols), of MPI implementations, and of application needs. We analyze shared-memory MPI communication from a cache coherence perspective through a new memory model. The model captures the characteristics of the memory architecture with microbenchmarks that expose the limitations of the memory accesses involved in the data transfer. We model the performance of intra-node communication without requiring complex analytical models. The advantage of this approach is that it does not require deep knowledge of rarely documented hardware features, such as caching policies or prefetchers, that make modeling modern memory subsystems hardly feasible. Our qualitative analysis based on this result leads to a better understanding of shared-memory communication performance for scientific computing. We then discuss possible optimizations, such as buffer reuse order, cache flushing, and non-temporal instructions, that MPI implementers could use.
Document type:
Conference paper
PDSEC - The 15th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing, held in conjunction with IPDPS, May 2014, Phoenix, AZ, United States. IEEE, 2014

https://hal.inria.fr/hal-00956307
Contributor: Brice Goglin
Submitted on: Thursday, March 6, 2014 - 11:21:32
Last modified on: Thursday, January 11, 2018 - 06:22:12
Document(s) archived on: Friday, June 6, 2014 - 10:52:12

File

article.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00956307, version 1

Citation

Bertrand Putigny, Benoit Ruelle, Brice Goglin. Analysis of MPI Shared-Memory Communication Performance from a Cache Coherence Perspective. PDSEC - The 15th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing, held in conjunction with IPDPS, May 2014, Phoenix, AZ, United States. IEEE, 2014. 〈hal-00956307〉
