J. Dongarra, The International Exascale Software Project roadmap, International Journal of High Performance Computing Applications, vol.25, issue.1, pp.3-60, 2011.
DOI : 10.1177/1094342010391989

J. Sancho, K. Barker, D. Kerbyson, and K. Davis, Quantifying the potential benefit of overlapping communication and computation in large-scale scientific applications, Proceedings of the 2006 ACM/IEEE conference on Supercomputing, 2006.

S. Potluri, P. Lai, K. Tomko, S. Sur, Y. Cui et al., Quantifying performance benefits of overlap using MPI-2 in a seismic modeling application, Proceedings of the 24th ACM International Conference on Supercomputing, ICS '10, pp.17-25, 2010.
DOI : 10.1145/1810085.1810092

G. Hager, G. Schubert, T. Schoenemeyer, and G. Wellein, Prospects for truly asynchronous communication with pure MPI and hybrid MPI/OpenMP on current supercomputing platforms.

M. J. Rashti and A. Afsahi, Improving Communication Progress and Overlap in MPI Rendezvous Protocol over RDMA-enabled Interconnects, 2008 22nd International Symposium on High Performance Computing Systems and Applications, pp.95-101, 2008.
DOI : 10.1109/HPCS.2008.10

S. Sur, H. Jin, L. Chai, and D. Panda, RDMA read based rendezvous protocol for MPI over InfiniBand, Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming, PPoPP '06, pp.32-39, 2006.
DOI : 10.1145/1122971.1122978

M. Wilcox, I'll do it later: Softirqs, tasklets, bottom halves, task queues, work queues and timers, Linux.conf.au, 2003.

T. Hoefler and A. Lumsdaine, Message progression in parallel computing - to thread or not to thread?, 2008 IEEE International Conference on Cluster Computing, pp.213-222, 2008.
DOI : 10.1109/CLUSTR.2008.4663774

A. Denis, pioman: a pthread-based Multithreaded Communication Engine, Euromicro International Conference on Parallel, Distributed and Network-based Processing, 2015.

F. Trahay and A. Denis, A scalable and generic task scheduling system for communication libraries, 2009 IEEE International Conference on Cluster Computing and Workshops, 2009.
DOI : 10.1109/CLUSTR.2009.5289169
URL : https://hal.archives-ouvertes.fr/inria-00408521

T. Hoefler, T. Schneider, and A. Lumsdaine, Accurately measuring overhead, communication time and progression of blocking and nonblocking collective operations at massive scale, International Journal of Parallel, Emergent and Distributed Systems, vol.4782, issue.4, pp.241-258, 2010.
DOI : 10.1007/3-540-45825-5_43

Intel, Intel MPI Benchmarks 4.1, https://software.intel.com/en-us/articles/intel-mpi-benchmarks

W. Lawry, C. Wilson, A. Maccabe, and R. Brightwell, COMB: a portable benchmark suite for assessing MPI overlap, Proceedings. IEEE International Conference on Cluster Computing, pp.472-475, 2002.
DOI : 10.1109/CLUSTR.2002.1137785

W. Gropp and E. Lusk, Reproducible measurements of MPI performance characteristics, Recent Advances in Parallel Virtual Machine and Message Passing Interface, Lecture Notes in Computer Science, pp.11-18, 1999.

A. Shet, P. Sadayappan, D. Bernholdt, J. Nieplocha, and V. Tipparaju, A framework for characterizing overlap of communication and computation in parallel applications, Cluster Computing, vol.20, issue.2, pp.75-90, 2008.
DOI : 10.1007/s10586-007-0046-3

R. L. Graham, T. S. Woodall, and J. M. Squyres, Open MPI: A Flexible High Performance MPI, The 6th Annual International Conference on Parallel Processing and Applied Mathematics, 2005.
DOI : 10.1007/11752578_29

M. Pérache, P. Carribault, and H. Jourdren, MPC-MPI: An MPI Implementation Reducing the Overall Memory Consumption, Recent Advances in Parallel Virtual Machine and Message Passing Interface, pp.94-103, 2009.
DOI : 10.1007/978-3-642-03770-2_16

B. Bode, M. Butler, T. Dunning, W. Gropp, T. Hoefler et al., The blue waters super-system for super-science, Contemporary High Performance Computing Architectures, 2012.

F. Trahay, F. Rue, M. Faverge, Y. Ishikawa, R. Namyst et al., EZTrace: A Generic Framework for Performance Analysis, 2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, 2011.
DOI : 10.1109/CCGrid.2011.83
URL : https://hal.archives-ouvertes.fr/inria-00587216