G. Blake, R. G. Dreslinski, and T. Mudge, A survey of multicore processors, IEEE Signal Processing Magazine, vol.26, issue.6, pp.26-37, 2009.
DOI : 10.1109/MSP.2009.934110

S. Blagodurov, S. Zhuravlev, M. Dashti, and A. Fedorova, A case for NUMA-aware contention management on multicore systems, Proceedings of the 19th international conference on Parallel architectures and compilation techniques, PACT '10, 2011.
DOI : 10.1145/1854273.1854350

D. Ziakas, A. Baum, R. A. Maddox, and R. J. Safranek, Intel R QuickPath Interconnect Architectural Features Supporting Scalable System Architectures In: High Performance Interconnects (HOTI) Bull atos technologies: Bull coherent switch, IEEE 18th Annual Symposium on, pp.1-6, 2010.

A. Ilic, F. Pratas, and L. Sousa, Cache-aware Roofline model: Upgrading the loft, IEEE Computer Architecture Letters, vol.13, issue.1, pp.21-24, 2014.
DOI : 10.1109/L-CA.2013.6

S. Williams, A. Waterman, and D. Patterson, Roofline, Communications of the ACM, vol.52, issue.4, pp.65-76, 2009.
DOI : 10.1145/1498765.1498785

C. Cantalupo, V. Venkatesan, J. Hammond, K. Czurlyo, and S. D. Hammond, memkind: An Extensible Heap Memory Manager for Heterogeneous Memory Platforms and Mixed Memory Policies, Sandia National Laboratories (SNL-NM), 2015.

F. Broquedis, J. Clet-ortega, S. Moreaud, N. Furmento, B. Goglin et al., hwloc: A Generic Framework for Managing Hardware Affinities in HPC Applications, 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing, 2010.
DOI : 10.1109/PDP.2010.67

URL : https://hal.archives-ouvertes.fr/inria-00429889

A. Kleen, A NUMA API for LINUX, Novel Inc, 2005.

B. Lepers, V. Quema, and A. Fedorova, Thread and Memory Placement on NUMA Systems: Asymmetry Matters, 2015 USENIX Annual Technical Conference (USENIX ATC 15), pp.277-289, 2015.

C. Chou, A. Jaleel, and M. K. Qureshi, CAMEO: A Two-Level Memory Organization with Capacity of Main Memory and Flexibility of Hardware-Managed Cache, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture, pp.1-12, 2014.
DOI : 10.1109/MICRO.2014.63

D. H. Bailey, E. Barszcz, J. T. Barton, D. S. Browning, R. L. Carter et al., The nas parallel benchmarks, The International Journal of Supercomputer Applications, 1991.

I. Karlin, J. Keasler, and R. Neely, Lulesh 2.0 updates and changes, 2013.
DOI : 10.2172/1090032

S. Ramos and T. Hoefler, Capability Models for Manycore Memory Systems: A Case- Study with Xeon Phi KNL 17. : The Memkind Library. http://memkind.github.io/memkind 18 Beyond the Roofline: Cache-Aware Power and Energy-Efficiency Modeling for Multi-Cores, IEEE Transactions on Computers, vol.66, issue.1, pp.52-58, 2017.

D. Doerfler, J. Deslippe, S. Williams, L. Oliker, B. Cook et al., Applying the Roofline Performance Model to the Intel Xeon Phi Knights Landing Processor, International Conference on High Performance Computing, pp.339-353, 2016.
DOI : 10.1145/1498765.1498785

O. G. Lorenzo, T. F. Pena, J. C. Cabaleiro, J. C. Pichel, and F. F. Rivera, Using an extended Roofline Model to understand data and thread affinities on NUMA systems, Annals of Multicore and GPU Programming, vol.1, issue.1, pp.56-67, 2014.

J. Hofmann, J. Eitzinger, and D. Fey, Execution-Cache-Memory Performance Model: Introduction and Validation, p.3118, 2015.

D. Marques, H. Duarte, A. Ilic, L. Sousa, R. Belenov et al., Performance Analysis with Cache-Aware Roofline Model in Intel Advisor, 2017 International Conference on High Performance Computing & Simulation (HPCS), pp.898-907, 2017.
DOI : 10.1109/HPCS.2017.150