, MadMPI/NewMadeleine OpenMPI, vol.1, issue.10

M. Intel, , 2017.

W. Schonbein, M. G. Dosanjh, R. E. Grant, and P. G. Bridges, Measuring multithreaded message matching misery, Euro-Par 2018: Parallel Processing, pp.480-491, 2018.
DOI : 10.1007/978-3-319-96983-1_34

J. Dongarra, M. Abalenkovs, A. Abdelfattah, M. Gates, A. Haidar et al., Parallel programming models for dense linear algebra on heterogeneous systems, Supercomputing Frontiers and Innovations, vol.2, issue.4, 2016.

E. Agullo, O. Aumage, M. Faverge, N. Furmento, F. Pruvost et al., Achieving High Performance on Supercomputers with a Sequential Task-based Programming Model, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01618526

O. Aumage, E. Brunet, N. Furmento, and R. Namyst, NewMadeleine: a Fast Communication Scheduling Engine for High Performance Networks, Workshop on Communication Architecture for Clusters (CAC 2007), workshop held in conjunction with IPDPS, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00127356

R. Brightwell, S. Goudy, and K. Underwood, A preliminary analysis of the MPI queue characteristics of several applications, 2005 International Conference on Parallel Processing (ICPP'05), pp.175-183, 2005.

R. Keller and R. L. Graham, Characteristics of the unexpected message queue of MPI applications, Recent Advances in the Message Passing Interface, pp.179-188, 2010.

G. Dózsa, S. Kumar, P. Balaji, D. Buntinas, D. Goodell et al., Enabling concurrent multithreaded MPI communication on multicore petascale systems, Recent Advances in the Message Passing Interface, pp.11-20, 2010.

J. A. Zounmevo and A. Afsahi, An efficient MPI message queue mechanism for large-scale jobs, 2012 IEEE 18th International Conference on Parallel and Distributed Systems, pp.464-471, 2012.
DOI : 10.1109/icpads.2012.70

M. Flajslik, J. Dinan, and K. D. Underwood, Mitigating MPI message matching misery, pp.281-299, 2016.
DOI : 10.1007/978-3-319-41321-1_15

M. Bayatpour, H. Subramoni, S. Chakraborty, and D. K. Panda, Adaptive and dynamic design for MPI tag matching, 2016 IEEE International Conference on Cluster Computing (CLUSTER), pp.1-10, 2016.
DOI : 10.1109/cluster.2016.69

T. Hoefler, G. Bronevetsky, B. Barrett, B. R. de Supinski, and A. Lumsdaine, Efficient MPI support for advanced hybrid programming models, Recent Advances in the Message Passing Interface, pp.50-61, 2010.
DOI : 10.1007/978-3-642-15646-5_6
URL : http://www.unixer.de/publications//img/hoefler-mprobe.pdf

É. Brunet, F. Trahay, and A. Denis, A Multicore-enabled Multirail Communication Engine, Proceedings of the IEEE International Conference on Cluster Computing, pp.316-321, 2008.
DOI : 10.1109/clustr.2008.4663788
URL : https://hal.archives-ouvertes.fr/inria-00327158

T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, 3rd ed., 2009.

D. Liu, Z. Cui, S. Xu, and H. Liu, An empirical study on the performance of hash table, 2014 IEEE/ACIS 13th International Conference on Computer and Information Science (ICIS), pp.477-484, 2014.

B. Jenkins, Hash functions, Dr. Dobb's Journal, 1997.

MPI Forum, MPI: A Message-Passing Interface Standard, Version 3.1, 2015.

S. Friedman, N. Leidenfrost, B. C. Brodie, and R. K. Cytron, Hashtables for embedded and real-time systems, Proceedings of the IEEE Workshop on Real-Time Embedded Systems, p.2001, 2001.

A. Denis, pioman: a pthread-based Multithreaded Communication Engine, Euromicro International Conference on Parallel, Distributed and Network-based Processing, 2015.
DOI : 10.1109/pdp.2015.78
URL : https://hal.archives-ouvertes.fr/hal-01087775

F. Trahay, É. Brunet, and A. Denis, An analysis of the impact of multi-threading on communication performance, CAC 2009: The 9th Workshop on Communication Architecture for Clusters, held in conjunction with IPDPS, 2009.
URL : https://hal.archives-ouvertes.fr/inria-00381670

A. Denis and F. Trahay, MPI Overlap: Benchmark and Analysis, 45th International Conference on Parallel Processing (ICPP), 2016.
DOI : 10.1109/icpp.2016.37
URL : https://hal.archives-ouvertes.fr/hal-01324179

, PM2 high performance runtime system

C. Augonnet, S. Thibault, R. Namyst, and P. Wacrenier, StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures, Euro-Par 2009, 2009.
URL : https://hal.archives-ouvertes.fr/inria-00384363