G. Bosilca, A. Bouteiller, A. Danalis, T. Herault, P. Lemarinier et al., DAGuE: A generic distributed DAG engine for high performance computing, 2010.

G. W. Stewart, Matrix algorithms, Society for Industrial and Applied Mathematics, 2001.

A. Buttari, J. Langou, J. Kurzak, and J. J. Dongarra, Parallel tiled QR factorization for multicore architectures, Concurrency and Computation: Practice and Experience, vol.21, issue.8, pp.1573-1590, 2008.
DOI : 10.1002/cpe.1301

A. Buttari, J. Langou, J. Kurzak, and J. J. Dongarra, A class of parallel tiled linear algebra algorithms for multicore architectures, Parallel Computing, vol.35, issue.1, pp.38-53, 2009.
DOI : 10.1016/j.parco.2008.10.002

E. S. Quintana-Ortí and R. A. van de Geijn, Updating an LU Factorization with Pivoting, ACM Transactions on Mathematical Software, vol.35, issue.2, 2008.
DOI : 10.1145/1377612.1377615

J. Kurzak, A. Buttari, and J. J. Dongarra, Solving Systems of Linear Equations on the CELL Processor Using Cholesky Factorization, IEEE Transactions on Parallel and Distributed Systems, vol.19, issue.9, pp.1175-1186, 2008.
DOI : 10.1109/TPDS.2007.70813

J. Kurzak and J. J. Dongarra, QR factorization for the CELL processor, Scientific Programming, pp.31-42, 2009.

F. G. Gustavson, Recursion leads to automatic variable blocking for dense linear-algebra algorithms, IBM Journal of Research and Development, vol.41, issue.6, pp.737-756, 1997.
DOI : 10.1147/rd.416.0737

F. G. Gustavson, New Generalized Matrix Data Structures Lead to a Variety of High-Performance Algorithms, Proceedings of the IFIP TC2/WG2.5 Working Conference on the Architecture of Scientific Software, pp.211-234, 2000.
DOI : 10.1007/978-0-387-35407-1_13

F. G. Gustavson, J. A. Gunnels, and J. C. Sexton, Minimal Data Copy for Dense Linear Algebra Factorization, Applied Parallel Computing, State of the Art in Scientific Computing, 8th International Workshop, pp.540-549, 2006.
DOI : 10.1007/978-3-540-75755-9_66

E. Elmroth, F. G. Gustavson, I. Jonsson, and B. Kågström, Recursive Blocked Algorithms and Hybrid Data Structures for Dense Matrix Library Software, SIAM Review, vol.46, issue.1, pp.3-45, 2004.
DOI : 10.1137/S0036144503428693

M. Cosnard and M. Loi, Automatic task graph generation techniques, HICSS '95: Proceedings of the 28th Hawaii International Conference on System Sciences, p.113, 1995.

M. Cosnard and E. Jeannot, Compact DAG Representation and Its Dynamic Scheduling, Journal of Parallel and Distributed Computing, vol.58, issue.3, pp.487-514, 1999.
DOI : 10.1006/jpdc.1999.1566
URL : https://hal.archives-ouvertes.fr/inria-00098841

M. Cosnard, E. Jeannot, and T. Yang, Compact DAG representation and its symbolic scheduling, Journal of Parallel and Distributed Computing, vol.64, issue.8, pp.921-935, 2004.
DOI : 10.1016/j.jpdc.2004.05.001
URL : https://hal.archives-ouvertes.fr/inria-00099958

F. Song, A. Yarkhan, and J. Dongarra, Dynamic task scheduling for linear algebra algorithms on distributed-memory multicore systems, SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, pp.1-11, 2009.

F. Song, Static and dynamic scheduling for effective use of multicore systems.

L. S. Blackford, J. Choi, A. Cleary, E. D'Azevedo et al., ScaLAPACK Users' Guide, SIAM, 1997.

C. L. Lawson, R. J. Hanson, D. Kincaid, and F. T. Krogh, Basic Linear Algebra Subprograms for Fortran Usage, ACM Transactions on Mathematical Software, vol.5, issue.3, pp.308-323, 1979.
DOI : 10.1145/355841.355847

J. J. Dongarra, J. Du Croz, I. S. Duff, and S. Hammarling, A set of level 3 basic linear algebra subprograms, ACM Transactions on Mathematical Software, vol.16, issue.1, pp.1-17, 1990.
DOI : 10.1145/77626.79170

C. Bischof and C. Van Loan, The WY Representation for Products of Householder Matrices, SIAM Journal on Scientific and Statistical Computing, vol.8, issue.1, pp.2-13, 1987.
DOI : 10.1137/0908009

R. Schreiber and C. Van Loan, A Storage-Efficient WY Representation for Products of Householder Transformations, SIAM Journal on Scientific and Statistical Computing, vol.10, issue.1, pp.53-57, 1991.
DOI : 10.1137/0910005

B. C. Gunter and R. A. van de Geijn, Parallel out-of-core computation and updating of the QR factorization, ACM Transactions on Mathematical Software, vol.31, issue.1, pp.60-78, 2005.
DOI : 10.1145/1055531.1055534

E. Chan, E. S. Quintana-Ortí, G. G. Quintana-Ortí, and R. van de Geijn, Supermatrix out-of-order scheduling of matrix operations for SMP and multi-core architectures, Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures, SPAA '07, pp.116-125, 2007.
DOI : 10.1145/1248377.1248397

J. Demmel, L. Grigori, M. F. Hoemmen, and J. Langou, Communication-optimal Parallel and Sequential QR and LU Factorizations, SIAM Journal on Scientific Computing, vol.34, issue.1, 2008.
DOI : 10.1137/080731992
URL : https://hal.archives-ouvertes.fr/hal-00870930

J. J. Dongarra, I. S. Duff, D. C. Sorensen, and H. A. van der Vorst, Numerical Linear Algebra for High-Performance Computers, 1998.
DOI : 10.1137/1.9780898719611

Z. Chen, J. Dongarra, P. Luszczek, and K. Roche, Self-adapting software for numerical linear algebra and LAPACK for clusters, Parallel Computing, vol.29, issue.11-12, pp.1723-1743, 2003.
DOI : 10.1016/j.parco.2003.05.014

F. G. Gustavson, L. Karlsson, and B. Kågström, Distributed SBP Cholesky factorization algorithms with near-optimal scheduling, ACM Transactions on Mathematical Software, vol.36, issue.2, pp.1-25, 2009.
DOI : 10.1145/1499096.1499100