E. Agullo, J. Dongarra, B. Hadri, J. Kurzak, J. Langou et al., PLASMA Users' Guide, 2009.

E. Agullo, B. Hadri, H. Ltaief, and J. Dongarra, Comparative study of one-sided factorizations with multiple software packages on multi-core hardware, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC '09, pp.1-12, 2009.
DOI : 10.1145/1654059.1654080

R. Allen and K. Kennedy, Optimizing Compilers for Modern Architectures: A Dependence-based Approach, 2001.

P. Bientinesi, B. Gunter, and R. van de Geijn, Families of algorithms related to the inversion of a Symmetric Positive Definite matrix, ACM Transactions on Mathematical Software, vol.35, issue.1, pp.1-22, 2008.
DOI : 10.1145/1377603.1377606

A. Buttari, J. Langou, J. Kurzak, and J. Dongarra, Parallel tiled QR factorization for multicore architectures, Concurrency and Computation: Practice and Experience, vol.20, issue.13, pp.1573-1590, 2008.
DOI : 10.1002/cpe.1301

A. Buttari, J. Langou, J. Kurzak, and J. Dongarra, A class of parallel tiled linear algebra algorithms for multicore architectures, Parallel Computing, vol.35, issue.1, pp.38-53, 2009.
DOI : 10.1016/j.parco.2008.10.002

E. Chan, Runtime data flow scheduling of matrix computations, FLAME Working Note #39, 2009.

E. Chan, F. G. Van Zee, P. Bientinesi, E. S. Quintana-Ortí, G. Quintana-Ortí et al., SuperMatrix, Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '08, pp.123-132, 2008.
DOI : 10.1145/1345206.1345227

N. Christofides, Graph Theory: An Algorithmic Approach, 1975.

R. Eigenmann, J. Hoeflinger, and D. Padua, On the automatic parallelization of the Perfect Benchmarks(R), IEEE Transactions on Parallel and Distributed Systems, vol.9, issue.1, pp.5-23, 1998.
DOI : 10.1109/71.655238

N. J. Higham, Accuracy and Stability of Numerical Algorithms, Society for Industrial and Applied Mathematics, 2002.
DOI : 10.1137/1.9780898718027

J. Kurzak and J. Dongarra, Fully dynamic scheduler for numerical computing on multicore processors, 2009.

J. Kurzak and J. Dongarra, QR factorization for the Cell Broadband Engine, Scientific Programming, vol.17, issue.1-2, pp.31-42, 2009.

J. M. Perez, R. Badia, and J. Labarta, A dependency-aware task-based programming environment for multi-core architectures, 2008 IEEE International Conference on Cluster Computing, 2008.
DOI : 10.1109/CLUSTR.2008.4663765

G. Quintana-Ortí, E. S. Quintana-Ortí, R. A. van de Geijn, F. G. Van Zee, and E. Chan, Programming matrix algorithms-by-blocks for thread-level parallelism, ACM Transactions on Mathematical Software, vol.36, issue.3
DOI : 10.1145/1527286.1527288

M. C. Rinard, D. J. Scales, and M. S. Lam, Jade: a high-level, machine-independent language for parallel programming, Computer, vol.26, issue.6, pp.28-38, 1993.
DOI : 10.1109/2.214440

H. Sutter, A fundamental turn toward concurrency in software, Dr. Dobb's Journal, vol.30, issue.3, 2005.

F. G. Van Zee, libflame: The Complete Reference, 2009.