D. Abrahams and A. Gurtovoy, C++ template metaprogramming: concepts, tools, and techniques from Boost and beyond, 2004.

C. Augonnet, S. Thibault, R. Namyst, and P. Wacrenier, StarPU: a unified platform for task scheduling on heterogeneous multicore architectures, Concurrency and Computation: Practice and Experience, pp.187-198, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00384363

M. Baboulin, Fast and reliable solutions for numerical linear algebra solvers in high-performance computing, 2012.
URL : https://hal.archives-ouvertes.fr/tel-00967523

M. Baboulin, D. Becker, and J. Dongarra, A Parallel Tiled Solver for Dense Symmetric Indefinite Systems on Multicore Architectures, 2012 IEEE 26th International Parallel and Distributed Processing Symposium, pp.14-24, 2012.
DOI : 10.1109/IPDPS.2012.12
URL : https://hal.archives-ouvertes.fr/inria-00631361

M. Baboulin, A. Buttari, J. Dongarra, J. Kurzak, J. Langou et al., Accelerating scientific computations with mixed precision algorithms, Computer Physics Communications, vol.180, issue.12, pp.2526-2533, 2009.
DOI : 10.1016/j.cpc.2008.11.005
URL : http://arxiv.org/abs/0808.2794

M. Baboulin, S. Donfack, J. Dongarra, L. Grigori, A. Rémy et al., A Class of Communication-avoiding Algorithms for Solving General Dense Linear Systems on CPU/GPU Parallel Machines, International Conference on Computational Science, Procedia Computer Science, pp.17-26, 2012.
DOI : 10.1016/j.procs.2012.04.003
URL : https://hal.archives-ouvertes.fr/hal-00656457

M. Baboulin, J. Dongarra, S. Gratton, and J. Langou, Computing the conditioning of the components of a linear least-squares solution, Numerical Linear Algebra with Applications, pp.517-533, 2009.

M. Baboulin, J. Dongarra, J. Herrmann, and S. Tomov, Accelerating Linear System Solutions Using Randomization Techniques, ACM Transactions on Mathematical Software, vol.39, issue.2, 2013.
DOI : 10.1145/2427023.2427025
URL : https://hal.archives-ouvertes.fr/inria-00593306

N. Bell and M. Garland, Cusp: Generic parallel algorithms for sparse matrix and graph computations

N. Bell and J. Hoberock, THRUST: a productivity-oriented library for CUDA, GPU Computing Gems, vol.7, 2011.
DOI : 10.1016/B978-0-12-811986-0.00033-9

A. Björck, Stability analysis of the method of semi-normal equations for least squares problems, Linear Algebra and its Applications, pp.31-48, 1987.

A. Björck, Numerical methods for least squares problems, SIAM, 1996.

W. Bright, D language Templates revisited

M. I. Cole, Algorithmic Skeletons, 1989.
DOI : 10.1007/978-1-4471-0841-2_13

C. Sanderson, Armadillo: An open source C++ linear algebra library for fast prototyping and computationally intensive experiments, 2010.

K. Czarnecki, U. Eisenecker, R. Glück, D. Vandevoorde, and T. Veldhuizen, Generative Programming and Active Libraries, Generic Programming, pp.25-39, 2000.
DOI : 10.1007/3-540-39953-4_3

K. Czarnecki and U. W. Eisenecker, Components and Generative Programming, Software Engineering - ESEC/FSE'99, pp.2-19, 1999.
DOI : 10.1007/3-540-48166-4_2

K. Czarnecki and U. W. Eisenecker, Generative Programming, 2000.

K. Czarnecki, K. Østerbye, and M. Völter, Generative Programming, Object-Oriented Technology ECOOP 2002 Workshop Reader, pp.15-29, 2002.
DOI : 10.1007/3-540-36208-8_2
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.387.6297

D. Demidov, VexCL: Vector expression template library for OpenCL, 2012.

J. Demmel, Y. Hida, W. Kahan, X. S. Li, S. Mukherjee et al., Error bounds from extra-precise iterative refinement, ACM Transactions on Mathematical Software, vol.32, issue.2, pp.325-351, 2006.
DOI : 10.1145/1141885.1141894
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.109.4101

J. Dongarra, Basic Linear Algebra Subprograms Technical Forum Standard, Int. J. of High Performance Computing Applications, vol.16, issue.1, 2002.
DOI : 10.1177/10943420020160010101

J. Enmyren and C. W. Kessler, SkePU, Proceedings of the fourth international workshop on High-level parallel programming and applications, HLPP '10, pp.5-14, 2010.
DOI : 10.1145/1863482.1863487

P. Esterie, J. Falcou, M. Gaunard, J. T. Lapresté, and L. Lacassagne, The numerical template toolbox: A modern C++ design for scientific computing, Journal of Parallel and Distributed Computing, vol.74, issue.12, 2014.
DOI : 10.1016/j.jpdc.2014.07.002
URL : https://hal.archives-ouvertes.fr/hal-01061305

J. Falcou, J. Sérot, L. Pech, and J. T. Lapresté, Meta-programming Applied to Automatic SMP Parallelization of Linear Algebra Code, Euro-Par 2008 - Parallel Processing, pp.729-738, 2008.
DOI : 10.1007/978-3-540-85451-7_78

P. Gottschling, D. S. Wise, and M. D. Adams, Representation-transparent matrix algorithms with scalable performance, Proceedings of the 21st annual international conference on Supercomputing, ICS '07, pp.116-125, 2007.
DOI : 10.1145/1274971.1274989

B. J. Gough and R. M. Stallman, An Introduction to GCC, Network Theory Ltd, 2004.

D. Gregor, J. Järvi, J. Siek, B. Stroustrup, G. Dos Reis et al., Concepts, ACM SIGPLAN Notices, vol.41, issue.10, pp.291-310, 2006.
DOI : 10.1145/1167515.1167499

G. Guennebaud and B. Jacob, Eigen v3, 2010.

N. J. Higham, Accuracy and Stability of Numerical Algorithms, SIAM, 2002.
DOI : 10.1137/1.9780898718027

K. Iglberger, G. Hager, J. Treibig, and U. Rüde, Expression Templates Revisited: A Performance Analysis of Current Methodologies, SIAM Journal on Scientific Computing, vol.34, issue.2, pp.42-69, 2012.
DOI : 10.1137/110830125

Intel, Math Kernel Library (MKL).
URL : http://www.intel.com/software/products

P. Luszczek, J. Kurzak, and J. Dongarra, Looking back at dense linear algebra software, Journal of Parallel and Distributed Computing, vol.74, issue.7, 2013.
DOI : 10.1016/j.jpdc.2013.10.005
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.260.120

D. R. Musser, G. J. Derge, and A. Saini, STL tutorial and reference guide: C++ programming with the standard template library, 2009.

C. C. Paige and M. A. Saunders, LSQR: An Algorithm for Sparse Linear Equations and Sparse Least Squares, ACM Transactions on Mathematical Software, vol.8, issue.1, pp.43-71, 1982.
DOI : 10.1145/355984.355989

D. Schmidt, Guest Editor's Introduction: Model-Driven Engineering, Computer, vol.39, issue.2, pp.25-31, 2006.
DOI : 10.1109/MC.2006.58

T. Sheard and S. Peyton Jones, Template meta-programming for Haskell, ACM SIGPLAN Notices, vol.37, issue.12, pp.60-75, 2002.
DOI : 10.1145/636517.636528
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.189.4479

W. Taha, A Gentle Introduction to Multi-stage Programming, Domain-Specific Program Generation, pp.30-50, 2004.
DOI : 10.1007/978-3-540-25935-0_3

S. Tomov, J. Dongarra, and M. Baboulin, Towards dense linear algebra for hybrid GPU accelerated manycore systems, Parallel Computing, vol.36, issue.5-6, pp.232-240, 2010.
DOI : 10.1016/j.parco.2009.12.005

F. G. Van Zee, E. Chan, R. A. van de Geijn, E. S. Quintana-Ortí, and G. Quintana-Ortí, The libflame library for dense matrix computations, Computing in Science & Engineering, vol.11, issue.6, pp.56-63, 2009.

T. L. Veldhuizen and D. Gannon, Active libraries: Rethinking the roles of compilers and libraries, Proceedings of the SIAM Workshop on Object Oriented Methods for Interoperable Scientific and Engineering Computing (OO'98), 1998.

T. Veldhuizen, Expression templates, C++ Report, vol.7, pp.26-31, 1995.

J. Walter and M. Koch, The boost uBLAS library, 2002.

R. C. Whaley and J. Dongarra, Automatically Tuned Linear Algebra Software, Proceedings of the IEEE/ACM SC98 Conference, pp.1-27, 1998.
DOI : 10.1109/SC.1998.10004
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.108.3487