Rank reordering for MPI communication optimization, Computers & Fluids, vol.80, 2012.
DOI : 10.1016/j.compfluid.2012.01.019
hwloc: A Generic Framework for Managing Hardware Affinities in HPC Applications, 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing, 2010.
DOI : 10.1109/PDP.2010.67
URL : https://hal.archives-ouvertes.fr/inria-00429889
LogP: Towards a Realistic Model of Parallel Computation, SIGPLAN Notices, pp.1-12, 1993.
Netloc: Towards a Comprehensive View of the HPC System Topology, 2014 43rd International Conference on Parallel Processing Workshops, pp.216-225, 2014.
DOI : 10.1109/ICPPW.2014.38
URL : https://hal.archives-ouvertes.fr/hal-01010599
Rank reordering strategy for MPI topology creation functions, Recent Advances in Parallel Virtual Machine and Message Passing Interface, pp.188-195, 1998.
DOI : 10.1007/BFb0056575
The communication challenge for MPP: Intel Paragon and Meiko CS-2, Parallel Computing, vol.20, issue.3, pp.389-398, 1994.
DOI : 10.1016/S0167-8191(06)80021-9
Locality-Aware Parallel Process Mapping for Multi-core HPC Systems, 2011 IEEE International Conference on Cluster Computing, pp.527-531, 2011.
DOI : 10.1109/CLUSTER.2011.59
Implementing the MPI Process Topology Mechanism, ACM/IEEE SC 2002 Conference (SC'02), pp.1-14, 2002.
DOI : 10.1109/SC.2002.10045
Process Placement in Multicore Clusters: Algorithmic Issues and Practical Techniques, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00921605
Fast Measurement of LogP Parameters for Message Passing Platforms, pp.1176-1183, 2000.
DOI : 10.1007/3-540-45591-4_162
A NUMA API for Linux, Novell Inc., 2005.
Towards an Efficient Process Placement Policy for MPI Applications in Multicore Environments, EuroPVM/MPI, pp.104-115, 2009.
DOI : 10.1109/PDP.2009.43
URL : https://hal.archives-ouvertes.fr/inria-00392581
Improving MPI Applications Performance on Multicore Clusters with Rank Reordering, EuroMPI (Lecture Notes in Computer Science), vol.49, pp.39-49, 2011.
DOI : 10.1145/1183401.1183451
URL : https://hal.archives-ouvertes.fr/hal-00643151
Plate-forme Fédérative pour la Recherche en Informatique et Mathématiques.
Hierarchical Parallel Matrix Multiplication on Large-Scale Distributed Memory Platforms, 2013 42nd International Conference on Parallel Processing, pp.754-762, 2013.
DOI : 10.1109/ICPP.2013.89
Multi-core and Network Aware MPI Topology Functions, EuroMPI 2011: Recent Advances in the Message Passing Interface, 18th European MPI Users' Group Meeting, pp.50-60, 2011.
High Performance Parallelism Pearls, 2015.
SUMMA: Scalable Universal Matrix Multiplication Algorithm, Concurrency: Practice and Experience, vol.9, issue.4, pp.255-274, 1997.
Process Mapping for MPI Collective Communications, Lecture Notes in Computer Science, vol.8, issue.11, pp.81-92, 2009.
DOI : 10.1109/ICPP.2005.62
Hierarchical Collectives in MPICH2, Proceedings of the 16th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface, pp.325-326, 2009.
DOI : 10.1109/JSSC.2007.910957