Performance prediction and evaluation of parallel processing on a NUMA multiprocessor, IEEE Transactions on Software Engineering, vol.17, issue.10, pp.1059-1068, 1991.
DOI : 10.1109/32.99193
Evaluation of NUMA memory management through modeling and measurements, IEEE Transactions on Parallel and Distributed Systems, vol.3, issue.6, pp.686-701, 1992.
DOI : 10.1109/71.180624
On the importance of parallel application placement in NUMA multiprocessors, in Proc. SEDMS IV, Symposium on Experiences with Distributed and Multiprocessor Systems, USENIX Association, pp.1-18, 1993.
Performance evaluation of hierarchical ring-based shared memory multiprocessors, IEEE Transactions on Computers, vol.43, issue.1, pp.52-67, 1994.
DOI : 10.1109/12.250609
What every programmer should know about memory, 2007.
A NUMA API for Linux, Novell Inc, 2005.
Memory Affinity for Hierarchical Shared Memory Multiprocessors, 21st International Symposium on Computer Architecture and High Performance Computing, pp.59-66, 2009.
DOI : 10.1109/sbac-pad.2009.16
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.596.4598
Local and remote memory: Memory in a Linux/NUMA system, 2006.
ForestGOMP: An Efficient OpenMP Environment for NUMA Architectures, International Journal of Parallel Programming, vol.38, issue.5-6, 2010.
DOI : 10.1007/s10766-010-0136-3
URL : https://hal.archives-ouvertes.fr/inria-00496295
Profiling Directed NUMA Optimization on Linux Systems: A Case Study of the Gaussian Computational Chemistry Code, 2011 IEEE International Parallel & Distributed Processing Symposium, pp.1046-1057, 2011.
DOI : 10.1109/IPDPS.2011.100
Memphis: Finding and fixing NUMA-related performance problems on multi-core platforms, 2010 IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS), pp.87-96, 2010.
DOI : 10.1109/ISPASS.2010.5452060
Using Memory Access Traces to Map Threads and Data on Hierarchical Multi-core Platforms, IEEE International Parallel & Distributed Processing Symposium, pp.551-558, 2011.
Evaluating Thread Placement Based on Memory Access Patterns for Multi-core Processors, 2010 IEEE 12th International Conference on High Performance Computing and Communications (HPCC), pp.491-496, 2010.
DOI : 10.1109/HPCC.2010.114
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.259.2173
The maximum weight perfect matching problem for complete weighted graphs is in PC, Proceedings of the Second IEEE Symposium on Parallel and Distributed Processing, pp.880-887, 1990.
NUMA-ICTM: A parallel version of ICTM exploiting memory placement strategies for NUMA machines, 2009 IEEE International Symposium on Parallel & Distributed Processing, pp.1-8, 2009.
DOI : 10.1109/IPDPS.2009.5161155
URL : https://hal.archives-ouvertes.fr/hal-00788917
Memory-aware Thread and Data Mapping for Hierarchical Multi-core Platforms, International Journal of Networking and Computing, pp.97-116, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00953051
Understanding Off-Chip Memory Contention of Parallel Programs in Multicore Systems, 2011 International Conference on Parallel Processing, pp.602-611, 2011.
DOI : 10.1109/ICPP.2011.59
Multi-core aware process mapping and its impact on communication overhead of parallel applications, 2009 IEEE Symposium on Computers and Communications, pp.811-817, 2009.
DOI : 10.1109/ISCC.2009.5202271
Locality-Aware Parallel Process Mapping for Multi-core HPC Systems, 2011 IEEE International Conference on Cluster Computing, pp.527-531, 2011.
DOI : 10.1109/CLUSTER.2011.59
Instruction-Based Sampling: A New Performance Analysis Technique for AMD Family 10h Processors, 2007.