Conference papers

High Throughput Intra-Node MPI Communication with Open-MX

Brice Goglin 1, 2
2 RUNTIME - Efficient runtime systems for parallel architectures
CNRS - Centre National de la Recherche Scientifique : UMR5800, UB - Université de Bordeaux, Inria Bordeaux - Sud-Ouest
Abstract : The increasing number of cores per node in high-performance computing requires an efficient intra-node MPI communication subsystem. Most existing MPI implementations rely on two copies across a shared memory-mapped file. Open-MX offers a single-copy mechanism that is tightly integrated in its regular communication stack, making it transparently available to the MX backend of many MPI layers. We describe this implementation and its offloaded copy backend using I/OAT hardware. Memory pinning requirements are then discussed, and overlapped pinning is introduced so that Open-MX intra-node data transfers can start earlier. Performance evaluation shows that this local communication stack performs better than MPICH2 and Open MPI for large messages, reaching up to 70% better throughput in micro-benchmarks when using I/OAT copy offload. Because only a single copy is involved, the Open-MX intra-node communication throughput also does not depend heavily on cache sharing between processing cores, making these performance improvements easier to observe in real applications.

Cited literature [13 references]
Contributor : Brice Goglin
Submitted on : Wednesday, October 15, 2008 - 5:10:41 PM
Last modification on : Monday, December 20, 2021 - 4:50:10 PM
Long-term archiving on : Tuesday, October 9, 2012 - 1:50:15 PM





Brice Goglin. High Throughput Intra-Node MPI Communication with Open-MX. 17th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP2009), Feb 2009, Weimar, Germany. ⟨10.1109/PDP.2009.20⟩. ⟨inria-00331209⟩


