H. Sutter, A fundamental turn toward concurrency in software, Dr. Dobb's Journal, vol.30, issue.3, 2005.

M. Frigo and S. Johnson, FFTW: an adaptive software architecture for the FFT, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98, pp.1381-1384, 1998.
DOI : 10.1109/ICASSP.1998.681704

J. W. Choi, A. Singh, and R. W. Vuduc, Model-driven autotuning of sparse matrix-vector multiply on GPUs, Proc. ACM SIGPLAN Symp. Principles and Practice of Parallel Programming (PPoPP), 2010.

J. Ansel, C. Chan, Y. L. Wong, M. Olszewski, Q. Zhao et al., PetaBricks: A language and compiler for algorithmic choice, ACM SIGPLAN Conference on Programming Language Design and Implementation, 2009.

C. Chan, J. Ansel, Y. L. Wong, S. Amarasinghe, and A. Edelman, Autotuning multigrid with PetaBricks, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC '09, 2009.
DOI : 10.1145/1654059.1654065

R. C. Whaley, A. Petitet, and J. J. Dongarra, Automated empirical optimizations of software and the ATLAS project, Parallel Computing, vol.27, issue.1-2, pp.3-35, 2001.

V. Volkov and J. W. Demmel, Benchmarking GPUs to tune dense linear algebra, 2008 SC, International Conference for High Performance Computing, Networking, Storage and Analysis, pp.1-11, 2008.
DOI : 10.1109/SC.2008.5214359

S. Tomov, R. Nath, H. Ltaief, and J. Dongarra, Dense linear algebra solvers for multicore with GPU accelerators, 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010.
DOI : 10.1109/IPDPSW.2010.5470941

G. Quintana-Ortí, E. Quintana-Ortí, R. van de Geijn, F. Van Zee, and E. Chan, Programming matrix algorithms-by-blocks for thread-level parallelism, ACM Transactions on Mathematical Software, vol.36, issue.3, 2009.
DOI : 10.1145/1527286.1527288

A. Buttari, J. Langou, J. Kurzak, and J. Dongarra, A class of parallel tiled linear algebra algorithms for multicore architectures, Parallel Computing, vol.35, issue.1, pp.38-53, 2009.
DOI : 10.1016/j.parco.2008.10.002

Intel, Math Kernel Library (MKL), http://www.intel.com/software/products

E. Agullo, B. Hadri, H. Ltaief, and J. Dongarra, Comparative study of one-sided factorizations with multiple software packages on multi-core hardware, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC '09, 2009.
DOI : 10.1145/1654059.1654080

N. Christofides, Graph Theory: An Algorithmic Approach, 1975.

R. C. Whaley and A. M. Castaldo, Achieving accurate and context-sensitive timing for code optimization, Software: Practice and Experience, pp.1621-1642, 2008.

J. W. Demmel, L. Grigori, M. F. Hoemmen, and J. Langou, Communication-optimal Parallel and Sequential QR and LU Factorizations, SIAM Journal on Scientific Computing, vol.34, issue.1, 2008.
DOI : 10.1137/080731992

URL : https://hal.archives-ouvertes.fr/hal-00870930

B. Hadri, H. Ltaief, E. Agullo, and J. Dongarra, Tile QR factorization with parallel panel processing for multicore architectures, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), 2010.
DOI : 10.1109/IPDPS.2010.5470443

URL : https://hal.archives-ouvertes.fr/inria-00548899

E. Agullo, C. Augonnet, J. Dongarra, M. Faverge, H. Ltaief et al., QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators, 2011 IEEE International Parallel & Distributed Processing Symposium, 2011.
DOI : 10.1109/IPDPS.2011.90

URL : https://hal.archives-ouvertes.fr/inria-00547614

G. Bosilca, A. Bouteiller, A. Danalis, T. Herault, P. Lemarinier et al., DAGuE: A generic distributed DAG engine for high performance computing, 2010.
