[Figure legend: "Comp" and "Comp + comms" series for Intel-MPI 2017 Async-Progress, Intel-MPI 2017 Async-Progress-Pin, Open MPI 3, and MPC-model-based]

References

IMB-NBC benchmarks. https://software.intel.com/fr-fr/node/561946, 2016-2018.

G. Almási, P. Heidelberger, C. J. Archer, X. Martorell, C. C. Erway et al., Optimization of MPI collective communication on BlueGene/L systems, Proceedings of the 19th annual international conference on Supercomputing, ICS '05, pp.253-262, 2005.
DOI : 10.1145/1088149.1088183

A. Denis, pioman: A Pthread-Based Multithreaded Communication Engine, 2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, 2015.
DOI : 10.1109/PDP.2015.78

URL : https://hal.archives-ouvertes.fr/hal-01087775

T. Hoefler and A. Lumsdaine, Message progression in parallel computing - to thread or not to thread?, 2008 IEEE International Conference on Cluster Computing, 2008.
DOI : 10.1109/CLUSTR.2008.4663774

URL : http://www.unixer.de/publications/img/hoefler-ib-threads.pdf

T. Hoefler and A. Lumsdaine, Optimizing non-blocking collective operations for InfiniBand, 2008 IEEE International Symposium on Parallel and Distributed Processing, 2008.
DOI : 10.1109/IPDPS.2008.4536138

T. Hoefler, A. Lumsdaine, and W. Rehm, Implementation and performance analysis of non-blocking collective operations for MPI, Proceedings of the 2007 ACM/IEEE conference on Supercomputing, SC '07, 2007.
DOI : 10.1145/1362622.1362692

P. Lai, P. Balaji, R. Thakur, and D. Panda, ProOnE: a general-purpose protocol onload engine for multi- and many-core architectures, Computer Science - Research and Development, vol.23, issue.3-4, 2009.
DOI : 10.1145/1006209.1006251

T. Ma, G. Bosilca, A. Bouteiller, B. Goglin, J. M. Squyres et al., Kernel Assisted Collective Intra-node MPI Communication among Multi-Core and Many-Core CPUs, 2011 International Conference on Parallel Processing, 2011.
DOI : 10.1109/ICPP.2011.29

URL : https://hal.archives-ouvertes.fr/inria-00602877

Message Passing Interface Forum, MPI: A Message-Passing Interface Standard, Version 3.0, 2012.

M. Pérache, H. Jourdren, and R. Namyst, MPC: A Unified Parallel Runtime for Clusters of NUMA Machines, Proceedings of the 14th International Euro-Par Conference, Springer, pp.78-88, 2008.
DOI : 10.1007/978-3-540-85451-7_9

M. J. Rashti and A. Afsahi, Improving communication progress and overlap in MPI rendezvous protocol over RDMA-enabled interconnects, 22nd International Symposium on High Performance Computing Systems and Applications, HPCS 2008, pp.95-101, 2008.

M. Si, A. Peña, P. Balaji, M. Takagi, and Y. Ishikawa, MT-MPI, Proceedings of the 28th ACM international conference on Supercomputing, ICS '14, 2014.
DOI : 10.1145/2597652.2597658

S. Sur, H. W. Jin, L. Chai, and D. K. Panda, RDMA read based rendezvous protocol for MPI over InfiniBand, Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming, PPoPP '06, pp.32-39, 2006.
DOI : 10.1145/1122971.1122978