G. Blake, R. G. Dreslinski, and T. Mudge, A survey of multicore processors, IEEE Signal Processing Magazine, vol.26, issue.6, pp.26-37, 2009.

S. Blagodurov, S. Zhuravlev, M. Dashti, and A. Fedorova, A case for NUMA-aware contention management on multicore systems, 2011 USENIX Annual Technical Conference, 2011.

J. Reinders, J. Jeffers, and A. Sodani, Intel Xeon Phi processor high performance programming: Knights Landing edition, 2016.

D. Ziakas, A. Baum, R. A. Maddox, and R. J. Safranek, Intel® QuickPath Interconnect architectural features supporting scalable system architectures, High Performance Interconnects (HOTI), 2010 IEEE 18th Annual Symposium on, pp.1-6, 2010.

Bull Atos Technologies: Bull Coherent Switch

A. Ilic, F. Pratas, and L. Sousa, Cache-aware roofline model: Upgrading the loft, IEEE Computer Architecture Letters, vol.13, issue.1, pp.21-24, 2014.

N. Denoyelle, B. Goglin, A. Ilic, E. Jeannot, and L. Sousa, High Performance Computing Systems: Performance Modeling, Benchmarking, and Simulation, 8th International Workshop, PMBS 2017, ser. Lecture Notes in Computer Science, vol.10724, pp.91-113, 2017.

S. Williams, A. Waterman, and D. Patterson, Roofline: An insightful visual performance model for multicore architectures, Commun. ACM, vol.52, issue.4, pp.65-76, 2009.
DOI : 10.2172/1407078

URL : https://www.osti.gov/servlets/purl/1407078

C. Cantalupo, V. Venkatesan, J. Hammond, K. Czurylo, and S. D. Hammond, memkind: An extensible heap memory manager for heterogeneous memory platforms and mixed memory policies, Sandia National Laboratories (SNL-NM), 2015.

F. Broquedis, J. Clet-Ortega, S. Moreaud, N. Furmento, B. Goglin et al., hwloc: a generic framework for managing hardware affinities in HPC applications, PDP 2010: The 18th Euromicro International Conference on Parallel, Distributed and Network-Based Computing, 2010.
DOI : 10.1109/pdp.2010.67

URL : https://hal.archives-ouvertes.fr/inria-00429889

A. Kleen, A NUMA API for Linux, 2005.

B. Lepers, V. Quéma, and A. Fedorova, Thread and memory placement on NUMA systems: Asymmetry matters, 2015 USENIX Annual Technical Conference (USENIX ATC 15), pp.277-289, 2015.

C. Chou, A. Jaleel, and M. K. Qureshi, CAMEO: A two-level memory organization with capacity of main memory and flexibility of hardware-managed cache, Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, ser. MICRO-47, pp.1-12, 2014.

S. Ramos and T. Hoefler, Capability models for manycore memory systems: A case-study with Xeon Phi KNL, 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp.297-306, 2017.
DOI : 10.1109/ipdps.2017.30

The memkind library

A. Ilic, F. Pratas, and L. Sousa, Beyond the roofline: Cache-aware power and energy-efficiency modeling for multi-cores, IEEE Transactions on Computers, vol.66, issue.1, pp.52-58, 2017.

D. Doerfler, J. Deslippe, S. Williams, L. Oliker, B. Cook et al., Applying the roofline performance model to the Intel Xeon Phi Knights Landing processor, International Conference on High Performance Computing, pp.339-353, 2016.

O. G. Lorenzo, T. F. Pena, J. C. Cabaleiro, J. C. Pichel, and F. F. Rivera, Using an extended roofline model to understand data and thread affinities on NUMA systems, Annals of Multicore and GPU Programming, vol.1, issue.1, pp.56-67, 2014.

J. Hofmann, J. Eitzinger, and D. Fey, Execution-cache-memory performance model: Introduction and validation, CoRR, 2015.

Intel, Intel® Advisor Roofline, 2017.

D. Marques, H. Duarte, A. Ilic, L. Sousa, R. Belenov et al., Performance analysis with cache-aware roofline model in Intel Advisor, 2017 International Conference on High Performance Computing & Simulation (HPCS), pp.898-907, 2017.