E. Agullo, A. Buttari, A. Guermouche, and F. Lopez, Multifrontal QR Factorization for Multicore Architectures over Runtime Systems, Euro-Par 2013 Parallel Processing, pp.521-532, 2013.
DOI : 10.1007/978-3-642-40047-6_53
URL : https://hal.archives-ouvertes.fr/hal-01220611

E. Agullo, A. Buttari, A. Guermouche, and F. Lopez, Task-Based Multifrontal QR Solver for GPU-Accelerated Multicore Architectures, 2015 IEEE 22nd International Conference on High Performance Computing (HiPC), pp.54-63, 2015.
DOI : 10.1109/HiPC.2015.27
URL : https://hal.archives-ouvertes.fr/hal-01166312

E. Agullo, A. Buttari, A. Guermouche, and F. Lopez, Implementing Multifrontal Sparse Solvers for Multicore Architectures with Sequential Task Flow Runtime Systems, ACM Transactions on Mathematical Software, vol.43, issue.2, pp.13:1-13:22, 2016.
DOI : 10.1145/2898348
URL : https://hal.archives-ouvertes.fr/hal-01333645

E. Agullo, J. Demmel, J. Dongarra, B. Hadri, J. Kurzak et al., Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects, Journal of Physics: Conference Series, vol.180, issue.1, p.012037, 2009.
DOI : 10.1088/1742-6596/180/1/012037

C. Augonnet, S. Thibault, R. Namyst, and P. A. Wacrenier, StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures, Concurrency and Computation: Practice and Experience, Special Issue: Euro-Par, pp.187-198, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00550877

E. Ayguadé, R. M. Badia, F. D. Igual, J. Labarta, R. Mayo et al., An Extension of the StarSs Programming Model for Platforms with Multiple GPUs, Euro-Par, pp.851-862, 2009.

M. Bauer, S. Treichler, E. Slaughter, and A. Aiken, Legion: Expressing locality and independence with logical regions, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis, p.66, 2012.
DOI : 10.1109/SC.2012.71
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.259.7715

G. Bosilca, A. Bouteiller, A. Danalis, M. Faverge, T. Hérault et al., PaRSEC: Exploiting Heterogeneity to Enhance Scalability, Computing in Science & Engineering, vol.15, issue.6, pp.36-45, 2013.
DOI : 10.1109/MCSE.2013.98

G. Bosilca, A. Bouteiller, A. Danalis, T. Hérault, P. Luszczek et al., Dense linear algebra on distributed heterogeneous hardware with a symbolic DAG approach, Scalable Computing and Communications: Theory and Practice, pp.699-733, 2013.

A. Buttari, Fine-Grained Multithreading for the Multifrontal $QR$ Factorization of Sparse Matrices, SIAM Journal on Scientific Computing, vol.35, issue.4, pp.323-345, 2013.
DOI : 10.1137/110846427
URL : https://hal.archives-ouvertes.fr/hal-01122471

A. Buttari, J. Langou, J. Kurzak, and J. Dongarra, A class of parallel tiled linear algebra algorithms for multicore architectures, Parallel Computing, vol.35, issue.1, pp.38-53, 2009.
DOI : 10.1016/j.parco.2008.10.002

T. A. Davis and Y. Hu, The University of Florida sparse matrix collection, ACM Transactions on Mathematical Software, vol.38, issue.1, pp.1:1-1:25, 2011.
DOI : 10.1145/2049662.2049663

D. Doerfler, J. Deslippe, S. Williams, L. Oliker, B. Cook et al., Applying the Roofline Performance Model to the Intel Xeon Phi Knights Landing Processor, High Performance Computing - ISC High Performance 2016 International Workshops, pp.339-353, 2016.

I. S. Duff and J. K. Reid, The Multifrontal Solution of Indefinite Sparse Symmetric Linear, ACM Transactions on Mathematical Software, vol.9, issue.3, pp.302-325, 1983.
DOI : 10.1145/356044.356047

B. Hadri, H. Ltaief, E. Agullo, and J. Dongarra, Tile QR factorization with parallel panel processing for multicore architectures, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), pp.1-10, 2010.
DOI : 10.1109/IPDPS.2010.5470443
URL : https://hal.archives-ouvertes.fr/inria-00548899

A. Haidar, S. Tomov, K. Arturov, M. Guney, S. Story et al., Programming model, performance analysis and optimization techniques for the Intel Knights Landing Xeon Phi, 2016 IEEE High Performance Extreme Computing Conference, pp.1-7, 2016.

K. Kim and V. Eijkhout, A Parallel Sparse Direct Solver via Hierarchical DAG Scheduling, ACM Transactions on Mathematical Software, vol.41, issue.1, pp.3:1-3:27, 2014.
DOI : 10.1145/2629641

X. Lacoste, M. Faverge, P. Ramet, S. Thibault, and G. Bosilca, Taking Advantage of Hybrid Systems for Sparse Direct Solvers via Task-Based Runtimes, 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014.
DOI : 10.1109/IPDPSW.2014.9
URL : https://hal.archives-ouvertes.fr/hal-00925017

T. M. Malas, T. Kurth, and J. Deslippe, Optimization of the sparse matrix-vector products of an IDR Krylov iterative solver in EMGeo for the Intel KNL manycore processor, High Performance Computing - ISC High Performance 2016 International Workshops, Revised Selected Papers, pp.378-389, 2016.

A. Morari, R. Gioiosa, R. W. Wisniewski, B. S. Rosenburg, T. A. Inglett et al., Evaluating the Impact of TLB Misses on Future HPC Systems, 2012 IEEE 26th International Parallel and Distributed Processing Symposium, pp.1010-1021, 2012.
DOI : 10.1109/IPDPS.2012.94

G. Quintana-Ortí, E. S. Quintana-Ortí, R. A. van de Geijn, F. G. Van Zee, and E. Chan, Programming matrix algorithms-by-blocks for thread-level parallelism, ACM Transactions on Mathematical Software, vol.36, issue.3, 2009.
DOI : 10.1145/1527286.1527288

J. Reinders, Intel Threading Building Blocks: Outfitting C++ for Multi-Core Processor Parallelism, 2007.

C. Rosales, J. Cazes, K. Milfeld, A. Gómez-Iglesias, L. Koesterke et al., A comparative study of application performance and scalability on the Intel Knights Landing processor, High Performance Computing - ISC High Performance 2016 International Workshops, Revised Selected Papers, pp.307-318, 2016.

E. Shmueli, G. Almasi, J. Brunheroto, J. Castanos, G. Dozsa et al., Evaluating the effect of replacing CNK with Linux on the compute-nodes of Blue Gene/L, Proceedings of the 22nd annual international conference on Supercomputing, ICS '08, pp.165-174, 2008.
DOI : 10.1145/1375527.1375554

A. Sodani, Knights Landing (KNL): 2nd Generation Intel® Xeon Phi processor, 2015 IEEE Hot Chips 27 Symposium (HCS), pp.1-24, 2015.
DOI : 10.1109/HOTCHIPS.2015.7477467