G. Aupy, A. Benoit, L. Pottier, P. Raghavan, Y. Robert et al., Co-Scheduling Algorithms for Cache-Partitioned Systems, 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
DOI : 10.1109/IPDPSW.2017.60
URL : https://hal.archives-ouvertes.fr/hal-01654660

D. H. Bailey, The NAS parallel benchmarks---summary and preliminary results, Proceedings of the 1991 ACM/IEEE conference on Supercomputing, Supercomputing '91, pp.158-165, 1991.
DOI : 10.1145/125826.125925

S. Blagodurov, S. Zhuravlev, and A. Fedorova, Contention-Aware Scheduling on Multicore Systems, ACM Transactions on Computer Systems, vol.28, issue.4, pp.1-8, 2010.
DOI : 10.1145/1880018.1880019

B. D. Bui, M. Caccamo, L. Sha, and J. Martinez, Impact of Cache Partitioning on Multi-tasking Real Time Embedded Systems, 2008 14th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, pp.101-110, 2008.
DOI : 10.1109/RTCSA.2008.42

D. Dauwe, E. Jonardi, R. Friese, S. Pasricha, A. A. Maciejewski et al., A Methodology for Co-Location Aware Application Performance Modeling in Multicore Computing, 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, pp.434-443, 2015.
DOI : 10.1109/IPDPSW.2015.38

J. Dongarra, Report on the Sunway TaihuLight system, www.netlib.org, 2016.

T. Dwyer, A. Fedorova, S. Blagodurov, M. Roth, F. Gaud et al., A practical method for estimating performance degradation on multicore processors, and its application to HPC workloads, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis, pp.1-8311, 2012.
DOI : 10.1109/SC.2012.11

A. Gainaru, G. Aupy, A. Benoit, F. Cappello, Y. Robert et al., Scheduling the I/O of HPC Applications Under Congestion, 2015 IEEE International Parallel and Distributed Processing Symposium, pp.1013-1022, 2015.
DOI : 10.1109/IPDPS.2015.116
URL : https://hal.archives-ouvertes.fr/hal-01251938

M. R. Garey and D. S. Johnson, Computers and Intractability, a Guide to the Theory of NP-Completeness, 1979.

N. Guan, M. Stigge, W. Yi, and G. Yu, Cache-aware scheduling and analysis for multicores, Proceedings of the seventh ACM international conference on Embedded software, EMSOFT '09, pp.245-254, 2009.
DOI : 10.1145/1629335.1629369

A. Hartstein, V. Srinivasan, T. Puzak, and P. Emma, On the nature of cache miss behavior: Is it √2?, The Journal of Instruction-Level Parallelism, vol.10, pp.1-22, 2008.

L. He, H. Zhu, and S. A. Jarvis, Developing Graph-Based Co-Scheduling Algorithms on Multicore Computers, IEEE Transactions on Parallel and Distributed Systems, vol.27, issue.6, pp.1617-1632, 2016.
DOI : 10.1109/TPDS.2015.2468223

Intel, Intel 64 and IA-32 architectures software developer's manual, 2014.

Y. Jiang, X. Shen, J. Chen, and R. Tripathi, Analysis and approximation of optimal co-scheduling on chip multiprocessors, Proceedings of the 17th international conference on Parallel architectures and compilation techniques, PACT '08, pp.220-229, 2008.
DOI : 10.1145/1454115.1454146

A. Krishna, A. Samih, and Y. Solihin, Data sharing in multi-threaded applications and its impact on chip design, 2012 IEEE International Symposium on Performance Analysis of Systems & Software, pp.125-134, 2012.
DOI : 10.1109/ISPASS.2012.6189219

E. Kultursay, M. Kandemir, A. Sivasubramaniam, and O. Mutlu, Evaluating STT-RAM as an energy-efficient main memory alternative, 2013 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp.256-267, 2013.
DOI : 10.1109/ISPASS.2013.6557176

M. A. Laurenzano, M. M. Tikir, L. Carrington, and A. Snavely, PEBIL: Efficient static binary instrumentation for Linux, 2010 IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS), pp.175-183, 2010.
DOI : 10.1109/ISPASS.2010.5452024

J. Leverich and C. Kozyrakis, Reconciling high server utilization and sub-millisecond quality-of-service, Proceedings of the Ninth European Conference on Computer Systems, EuroSys '14, p.4, 2014.
DOI : 10.1145/2592798.2592821

D. Lo, L. Cheng, R. Govindaraju, P. Ranganathan, and C. Kozyrakis, Improving Resource Efficiency at Scale with Heracles, ACM Transactions on Computer Systems, vol.34, issue.2, p.6, 2016.
DOI : 10.1145/2882783

D. Molka, D. Hackenberg, R. Schone, and W. E. Nagel, Cache Coherence Protocol and Memory Performance of the Intel Haswell-EP Architecture, 2015 44th International Conference on Parallel Processing, pp.739-748, 2015.
DOI : 10.1109/ICPP.2015.83

S. P. Muralidhara, L. Subramanian, O. Mutlu, M. Kandemir, and T. Moscibroda, Reducing memory interference in multicore systems via application-aware memory channel partitioning, Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO-44 '11, pp.374-385, 2011.
DOI : 10.1145/2155620.2155664

A. J. Pena and P. Balaji, Toward the efficient use of multiple explicitly managed memory subsystems, 2014 IEEE International Conference on Cluster Computing (CLUSTER), pp.123-131, 2014.
DOI : 10.1109/CLUSTER.2014.6968756

M. K. Qureshi and Y. N. Patt, Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06), pp.423-432, 2006.
DOI : 10.1109/MICRO.2006.49

B. M. Rogers, A. Krishna, G. B. Bell, K. Vu, X. Jiang et al., Scaling the bandwidth wall: challenges in and avenues for CMP scaling, ACM SIGARCH Computer Architecture News

C. Sewell, K. Heitmann, H. Finkel, G. Zagaris, S. T. Parete-Koon et al., Large-scale compute-intensive analysis via a combined in-situ and co-scheduling workflow approach, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC '15, p.50, 2015.
DOI : 10.1145/2535571.2535592

K. Tian, Y. Jiang, and X. Shen, A study on optimally co-scheduling jobs of different lengths on chip multiprocessors, Proceedings of the 6th ACM conference on Computing frontiers, CF '09, pp.41-50, 2009.
DOI : 10.1145/1531743.1531752

Y. Zhang, M. A. Laurenzano, J. Mars, and L. Tang, SMiTe: Precise QoS Prediction on Real-System SMT Processors to Improve Utilization in Warehouse Scale Computers, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture, pp.406-418, 2014.
DOI : 10.1109/MICRO.2014.53

H. Zhu, L. He, B. Gao, K. Li, J. Sun et al., Modelling and Developing Co-scheduling Strategies on Multicore Processors, 2015 44th International Conference on Parallel Processing, pp.220-229, 2015.
DOI : 10.1109/ICPP.2015.31

S. Zhuravlev, S. Blagodurov, and A. Fedorova, Addressing shared resource contention in multicore processors via scheduling, ACM Sigplan Notices, pp.129-142, 2010.