MPI versus MPI+OpenMP on the IBM SP for the NAS Benchmarks, ACM/IEEE SC 2000 Conference (SC'00), p.12, 2000.
DOI : 10.1109/SC.2000.10001
Implementation and evaluation of shared-memory communication and synchronization operations in MPICH2 using the Nemesis communication subsystem, Parallel Computing, vol.33, issue.9, pp.634-644, 2007.
DOI : 10.1016/j.parco.2007.06.003
URL : https://hal.archives-ouvertes.fr/hal-00344327
Designing an Efficient Kernel-Level and User-Level Hybrid Approach for MPI Intra-Node Communication on Multi-Core Systems, 2008 37th International Conference on Parallel Processing, 2008.
DOI : 10.1109/ICPP.2008.16
Cache-Efficient, Intranode, Large-Message MPI Communication with MPICH2-Nemesis, 2009 International Conference on Parallel Processing, 2009.
DOI : 10.1109/ICPP.2009.22
URL : https://hal.archives-ouvertes.fr/inria-00390064
Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation, Proceedings, 11th European PVM/MPI Users' Group Meeting, pp.97-104, 2004.
DOI : 10.1007/978-3-540-30218-6_19
Data Transfers between Processes in an SMP System: Performance Study and Application to MPI, 2006 International Conference on Parallel Processing (ICPP'06), pp.487-496, 2006.
DOI : 10.1109/ICPP.2006.31
High Throughput Intra-Node MPI Communication with Open-MX, 2009 17th Euromicro International Conference on Parallel, Distributed and Network-based Processing, 2009.
DOI : 10.1109/PDP.2009.20
URL : https://hal.archives-ouvertes.fr/inria-00331209
SMARTMAP: Operating system support for efficient data sharing among processes on a multi-core processor, 2008 SC, International Conference for High Performance Computing, Networking, Storage and Analysis, 2008.
DOI : 10.1109/SC.2008.5218881
Performance Analysis and Evaluation of PCIe 2.0 and Quad-Data Rate InfiniBand, 2008 16th IEEE Symposium on High Performance Interconnects, 2008.
DOI : 10.1109/HOTI.2008.26
Lightweight kernel-level primitives for high-performance MPI intra-node communication over multi-core systems, 2007 IEEE International Conference on Cluster Computing, 2007.
DOI : 10.1109/CLUSTR.2007.4629263
Accelerating Network Receive Processing (Intel I/O Acceleration Technology), Proceedings of the Linux Symposium, pp.281-288, 2005.
Efficient asynchronous memory copy operations on multi-core systems and I/OAT, 2007 IEEE International Conference on Cluster Computing, 2007.
DOI : 10.1109/CLUSTR.2007.4629228
The NAS Parallel Benchmarks, International Journal of High Performance Computing Applications, vol.5, issue.3, pp.63-73, 1991.
DOI : 10.1177/109434209100500306
Implementation and performance analysis of non-blocking collective operations for MPI, Proceedings of the 2007 ACM/IEEE conference on Supercomputing, SC '07, 2007.
DOI : 10.1145/1362622.1362692