S. Blagodurov, S. Zhuravlev, M. Dashti, and A. Fedorova, A case for NUMA-aware contention management on multicore systems, Proceedings of the 19th international conference on Parallel architectures and compilation techniques, PACT '10, pp.1-1, 2010.
DOI : 10.1145/1854273.1854350

M. Dashti, A. Fedorova, J. Funston, F. Gaud, R. Lachaize et al., Traffic management: A holistic approach to memory placement on NUMA systems, Proceedings of the Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS '13, pp.381-394, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00945758

J. Dongarra, K. London, S. Moore, P. Mucci, and D. Terpstra, Using PAPI for hardware performance monitoring on Linux systems, Conference on Linux Clusters: The HPC Revolution, Linux Clusters Institute, 2001.

P. J. Drongowski, Instruction-based sampling: A new performance analysis technique for AMD Family 10h processors, 2007.

R. Lachaize, B. Lepers, and V. Quéma, MemProf: A memory profiler for NUMA multicore systems, Proceedings of the 2012 USENIX Conference on Annual Technical Conference, USENIX ATC'12, pp.5-5, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00945731

X. Liu and J. Mellor-Crummey, A tool to analyze the performance of multithreaded programs on NUMA architectures, Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '14, pp.259-272, 2014.

I. Lopez, S. Moore, and V. Weaver, A prototype sampling interface for PAPI, Proceedings of the 2015 XSEDE Conference on Scientific Advancements Enabled by Enhanced Cyberinfrastructure, XSEDE '15, pp.1-27, 2015.
DOI : 10.1145/2792745.2792772

B. Lepers, V. Quéma, and A. Fedorova, Thread and memory placement on NUMA systems: Asymmetry matters, 2015 USENIX Annual Technical Conference (USENIX ATC 15), pp.277-289, 2015.

X. Liu and B. Wu, ScaAnalyzer, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC '15, pp.1-47, 2015.
DOI : 10.1145/2807591.2807648

D. Molka, D. Hackenberg, R. Schöne, and W. E. Nagel, Cache Coherence Protocol and Memory Performance of the Intel Haswell-EP Architecture, 2015 44th International Conference on Parallel Processing, pp.739-748, 2015.
DOI : 10.1109/ICPP.2015.83

C. McCurdy and J. Vetter, Memphis: Finding and fixing NUMA-related performance problems on multi-core platforms, 2010 IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS), pp.87-96, 2010.
DOI : 10.1109/ISPASS.2010.5452060

M. Selva, Performance Monitoring of Throughput Constrained Dataflow Programs Executed On Shared-Memory Multi-core Architectures, Theses, Institut National des Sciences Appliquées de Lyon, 2015.
URL : https://hal.archives-ouvertes.fr/tel-01264258

H. Yviquel, A. Lorence, K. Jerbi, G. Cocherel, A. Sanchez et al., Orcc: Multimedia development made easy, Proceedings of the 21st ACM International Conference on Multimedia, MM '13, pp.863-866, 2013.