F. Broquedis, J. Clet-Ortega, S. Moreaud, N. Furmento, B. Goglin et al., hwloc: A Generic Framework for Managing Hardware Affinities in HPC Applications, 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing, pp.180-186, 2010.
DOI : 10.1109/PDP.2010.67
URL : https://hal.archives-ouvertes.fr/inria-00429889

A. Szalay, A. Bunn, J. Gray, I. Foster, and I. Raicu, The importance of data locality in distributed computing applications, NSF Workflow Workshop, 2006.

M. Steckermeier and F. Bellosa, Using locality information in userlevel scheduling, 1995.

S. Moreaud and B. Goglin, Impact of NUMA Effects on High-Speed Networking with Multi-Opteron Machines, Proceedings of the 19th IASTED International Conference on Parallel and Distributed Computing and Systems, pp.24-29, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00175747

F. Song, S. Moore, and J. Dongarra, Feedback-directed thread scheduling with memory considerations, Proceedings of the 16th international symposium on High performance distributed computing, HPDC '07, 2007.
DOI : 10.1145/1272366.1272380

S. Kim, D. Chandra, and Y. Solihin, Fair Cache Sharing and Partitioning in a Chip Multiprocessor Architecture, Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques (PACT '04), pp.111-122, 2004.

T. Hoefler and M. Snir, Generic topology mapping strategies for large-scale parallel architectures, Proceedings of the international conference on Supercomputing, ICS '11, pp.75-85, 2011.
DOI : 10.1145/1995896.1995909

H. Subramoni, S. Potluri, K. Kandalla, B. Barth, J. Vienne et al., Design of a scalable InfiniBand topology service to enable network-topology-aware placement of processes, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis, 2012.
DOI : 10.1109/SC.2012.47

E. Jeannot, G. Mercier, and F. Tessier, Process placement in multicore clusters: Algorithmic issues and practical techniques, IEEE Transactions on Parallel and Distributed Systems, vol.99, issue.PrePrints, p.1, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00803548

D. Buntinas, B. Goglin, D. Goodell, G. Mercier, and S. Moreaud, Cache-Efficient, Intranode, Large-Message MPI Communication with MPICH2-Nemesis, 2009 International Conference on Parallel Processing, pp.462-469, 2009.
DOI : 10.1109/ICPP.2009.22
URL : https://hal.archives-ouvertes.fr/inria-00390064

T. Ma, G. Bosilca, A. Bouteiller, and J. J. Dongarra, Locality and Topology Aware Intra-node Communication among Multicore CPUs, Proceedings of the 17th European MPI Users Group Conference, ser. Lecture Notes in Computer Science, 2010.
DOI : 10.1007/978-3-642-15646-5_28

S. Moreaud, B. Goglin, and R. Namyst, Adaptive MPI Multirail Tuning for Non-uniform Input/Output Access, Proceedings of the 17th European MPI Users Group Conference, ser. Lecture Notes in Computer Science, 2010.
DOI : 10.1007/978-3-642-15646-5_25
URL : https://hal.archives-ouvertes.fr/inria-00486178

B. Goglin and S. Moreaud, Dodging Non-uniform I/O Access in Hierarchical Collective Operations for Multicore Clusters, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum, 2011.
DOI : 10.1109/IPDPS.2011.222
URL : https://hal.archives-ouvertes.fr/inria-00566246

J. Hursey and J. M. Squyres, Advancing application process affinity experimentation, Proceedings of the 20th European MPI Users' Group Meeting, EuroMPI '13, pp.163-168, 2013.
DOI : 10.1145/2488551.2488603

B. Goglin, J. Hursey, and J. M. Squyres, Netloc: Towards a Comprehensive View of the HPC System Topology, 2014 43rd International Conference on Parallel Processing Workshops, 2014.
DOI : 10.1109/ICPPW.2014.38
URL : https://hal.archives-ouvertes.fr/hal-01010599