Data Transfers between Processes in an SMP System: Performance Study and Application to MPI, 2006 International Conference on Parallel Processing (ICPP'06), pp.487-496, 2006. ,
DOI : 10.1109/ICPP.2006.31
SMARTMAP: Operating system support for efficient data sharing among processes on a multi-core processor, 2008 SC, International Conference for High Performance Computing, Networking, Storage and Analysis, 2008. ,
DOI : 10.1109/SC.2008.5218881
Designing an Efficient Kernel-Level and User-Level Hybrid Approach for MPI Intra-Node Communication on Multi-Core Systems, 2008 37th International Conference on Parallel Processing, 2008. ,
DOI : 10.1109/ICPP.2008.16
KNEM: A generic and scalable kernel-assisted intra-node MPI communication framework, Journal of Parallel and Distributed Computing, vol.73, issue.2, pp.176-188, 2013. ,
DOI : 10.1016/j.jpdc.2012.09.016
URL : https://hal.archives-ouvertes.fr/hal-00731714
Cross Memory Attach, 2010. ,
Cache-Efficient, Intranode, Large-Message MPI Communication with MPICH2-Nemesis, 2009 International Conference on Parallel Processing, pp.462-469, 2009. ,
DOI : 10.1109/ICPP.2009.22
URL : https://hal.archives-ouvertes.fr/inria-00390064
Locality and Topology Aware Intra-node Communication among Multicore CPUs, Proceedings of the 17th European MPI Users Group Conference, ser. Lecture Notes in Computer Science, 2010. ,
DOI : 10.1007/978-3-642-15646-5_28
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.180.4377
Optimizing MPI communication within large multicore nodes with kernel assistance, 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010. ,
DOI : 10.1109/IPDPSW.2010.5470849
URL : https://hal.archives-ouvertes.fr/inria-00451471
A benchmark-based performance model for memory-bound HPC applications, 2014 International Conference on High Performance Computing & Simulation (HPCS), 2014. ,
DOI : 10.1109/HPCSim.2014.6903790
URL : https://hal.archives-ouvertes.fr/hal-00985598
A low-overhead coherence solution for multiprocessors with private cache memories, ACM SIGARCH Computer Architecture News, vol.12, issue.3, pp.348-354, 1984. ,
DOI : 10.1145/773453.808204
Implementation and Shared-Memory Evaluation of MPICH2 over the Nemesis Communication Subsystem, Recent Advances in Parallel Virtual Machine and Message Passing Interface: Proc. 13th European PVM/MPI Users Group Meeting, 2006. ,
DOI : 10.1007/11846802_19
URL : https://hal.archives-ouvertes.fr/hal-00344339
Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation, Proceedings, 11th European PVM/MPI Users' Group Meeting, pp.97-104, 2004. ,
DOI : 10.1007/978-3-540-30218-6_19
On the Effects of CPU Caches on MPI Point-to-Point Communications, 2012 IEEE International Conference on Cluster Computing, pp.495-503, 2012. ,
DOI : 10.1109/CLUSTER.2012.22
Locating cache performance bottlenecks using data profiling, Proceedings of the 5th European conference on Computer systems, EuroSys '10, pp.335-348, 2010. ,
DOI : 10.1145/1755913.1755947
A High Performance Superpipeline Protocol for InfiniBand, Proceedings of the 17th International Euro- Par Conference, pp.276-287, 2011. ,
DOI : 10.1007/978-3-642-23397-5_27
URL : https://hal.archives-ouvertes.fr/inria-00586015
A Tool for Optimizing Runtime Parameters of Open MPI, Proceedings of the 15th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface, pp.210-217, 2008. ,
DOI : 10.1007/BFb0056559
Optimizing MPI Runtime Parameter Settings by Using Machine Learning, EuroPVM/MPI, ser, pp.196-206, 2009. ,
DOI : 10.1007/978-3-642-03770-2_26
Modeling Communication in Cache-Coherent SMP Systems -A Case-Study with Xeon Phi, Proceedings of the 22nd international symposium on High-performance parallel and distributed computing, pp.6-2013 ,