D. H. Bailey, E. Barszcz, J. T. Barton, D. S. Browning, R. L. Carter et al., The NAS Parallel Benchmarks, International Journal of High Performance Computing Applications, vol.5, issue.3, pp.63-73, 1991.
DOI : 10.1177/109434209100500306

D. Buntinas, G. Mercier, and W. Gropp, Implementation and Shared-Memory Evaluation of MPICH2 over the Nemesis Communication Subsystem, Recent Advances in Parallel Virtual Machine and Message Passing Interface: Proc. 13th European PVM/MPI Users Group Meeting, 2006.
DOI : 10.1007/11846802_19

URL : https://hal.archives-ouvertes.fr/hal-00344339

F. Cappello and D. Etiemble, MPI versus MPI+OpenMP on the IBM SP for the NAS Benchmarks, ACM/IEEE SC 2000 Conference (SC'00), 2000.
DOI : 10.1109/SC.2000.10001

L. Chai, A. Hartono, and D. K. Panda, Designing High Performance and Scalable MPI Intra-node Communication Support for Clusters, 2006 IEEE International Conference on Cluster Computing, 2006.
DOI : 10.1109/CLUSTR.2006.311850

E. Gabriel, G. E. Fagg, G. Bosilca, T. Angskun, J. J. Dongarra et al., Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation, Proceedings, 11th European PVM/MPI Users' Group Meeting, pp.97-104, 2004.
DOI : 10.1007/978-3-540-30218-6_19

B. Goglin, Design and implementation of Open-MX: High-performance message passing over generic Ethernet hardware, 2008 IEEE International Symposium on Parallel and Distributed Processing, 2008.
DOI : 10.1109/IPDPS.2008.4536140

URL : https://hal.archives-ouvertes.fr/inria-00210704

B. Goglin, Improving message passing over Ethernet with I/OAT copy offload in Open-MX, 2008 IEEE International Conference on Cluster Computing, pp.223-231, 2008.
DOI : 10.1109/CLUSTR.2008.4663775

URL : https://hal.archives-ouvertes.fr/inria-00288757

B. Goglin, L. Prylli, and O. Glück, Optimizations of client's side communications in a distributed file system within a Myrinet cluster, 29th Annual IEEE International Conference on Local Computer Networks, pp.726-733, 2004.
DOI : 10.1109/LCN.2004.92

A. Grover and C. Leech, Accelerating Network Receive Processing (Intel I/O Acceleration Technology), Proceedings of the Linux Symposium, pp.281-288, 2005.

Intel MPI Benchmarks.

Myricom, Inc., Myrinet Express (MX): A High Performance, Low-Level, Message-Passing Interface for Myrinet, 2006.

H. Tezuka, F. O'Carroll, A. Hori, and Y. Ishikawa, Pin-down cache: a virtual memory management technique for zero-copy communication, Proceedings of the First Merged International Parallel Processing Symposium and Symposium on Parallel and Distributed Processing, pp.308-315, 1998.
DOI : 10.1109/IPPS.1998.669932

K. Vaidyanathan, L. Chai, W. Huang, and D. K. Panda, Efficient asynchronous memory copy operations on multi-core systems and I/OAT, 2007 IEEE International Conference on Cluster Computing, 2007.
DOI : 10.1109/CLUSTR.2007.4629228

K. Vaidyanathan, W. Huang, L. Chai, and D. K. Panda, Designing Efficient Asynchronous Memory Operations Using Hardware Copy Engine: A Case Study with I/OAT, 2007 IEEE International Parallel and Distributed Processing Symposium, p.234, 2007.
DOI : 10.1109/IPDPS.2007.370479