The International Exascale Software Project roadmap, International Journal of High Performance Computing Applications, vol.25, issue.1, pp.3-60, 2011. ,
DOI : 10.1177/1094342010391989
Quantifying the potential benefit of overlapping communication and computation in largescale scientific applications, Proceedings of the 2006 ACM/IEEE conference on Supercomputing, 2006. ,
Quantifying performance benefits of overlap using MPI-2 in a seismic modeling application, Proceedings of the 24th ACM International Conference on Supercomputing, ICS '10, pp.17-25, 2010. ,
DOI : 10.1145/1810085.1810092
Prospects for truly asynchronous communication with pure mpi and hybrid mpi/openmp on current supercomputing platforms ,
Improving Communication Progress and Overlap in MPI Rendezvous Protocol over RDMA-enabled Interconnects, 2008 22nd International Symposium on High Performance Computing Systems and Applications, pp.95-101, 2008. ,
DOI : 10.1109/HPCS.2008.10
RDMA read based rendezvous protocol for MPI over InfiniBand, Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming , PPoPP '06, pp.32-39, 2006. ,
DOI : 10.1145/1122971.1122978
I'll do it later: Softirqs, tasklets, bottom halves, task queues, work queues and timers, Linux.conf.au, 2003. ,
Message progression in parallel computing - to thread or not to thread?, 2008 IEEE International Conference on Cluster Computing, pp.213-222, 2008. ,
DOI : 10.1109/CLUSTR.2008.4663774
pioman: a pthread-based Multithreaded Communication Engine Available: https, Euromicro International Conference on Parallel, Distributed and Network-based Processing, 2015. ,
A scalable and generic task scheduling system for communication libraries, 2009 IEEE International Conference on Cluster Computing and Workshops, 2009. ,
DOI : 10.1109/CLUSTR.2009.5289169
URL : https://hal.archives-ouvertes.fr/inria-00408521
Accurately measuring overhead, communication time and progression of blocking and nonblocking collective operations at massive scale, International Journal of Parallel, Emergent and Distributed Systems, vol.4782, issue.4, pp.241-258, 2010. ,
DOI : 10.1007/3-540-45825-5_43
Intel mpi benchmarks 4.1, " https://software.intel.com/en-us/ articles/intel-mpi-benchmarks ,
COMB: a portable benchmark suite for assessing MPI overlap, Proceedings. IEEE International Conference on Cluster Computing, pp.472-475, 2002. ,
DOI : 10.1109/CLUSTR.2002.1137785
Reproducible measurements of mpi performance characteristics, " in Recent Advances in Parallel Virtual Machine and Message Passing Interface, ser. Lecture Notes in Computer Science, pp.11-18, 1999. ,
A framework for characterizing overlap of communication and computation in parallel applications, Cluster Computing, vol.20, issue.2, pp.75-90, 2008. ,
DOI : 10.1007/s10586-007-0046-3
Open MPI: A Flexible High Performance MPI, The 6th Annual International Conference on Parallel Processing and Applied Mathematics, 2005. ,
DOI : 10.1007/11752578_29
MPC-MPI: An MPI Implementation Reducing the Overall Memory Consumption, Recent Advances in Parallel Virtual Machine and Message Passing Interface, pp.94-103978, 2009. ,
DOI : 10.1007/978-3-642-03770-2_16
The blue waters super-system for super-science, Contemporary High Performance Computing Architectures, 2012. ,
EZTrace: A Generic Framework for Performance Analysis, 2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, 2011. ,
DOI : 10.1109/CCGrid.2011.83
URL : https://hal.archives-ouvertes.fr/inria-00587216