J. Aasen, On the reduction of a symmetric matrix to tridiagonal form, BIT, vol.11, issue.3, pp.233-242, 1971.
DOI : 10.1007/BF01931804

M. Abalenkovs, A. Abdelfattah, J. Dongarra, M. Gates, A. Haidar et al., Parallel programming models for dense linear algebra on heterogeneous systems, Supercomputing Frontiers and Innovations, vol.2, issue.4, 2015.

A. Abdelfattah, A. Haidar, S. Tomov, and J. Dongarra, Performance, Design, and Autotuning of Batched GEMM for GPUs, The International Supercomputing Conference (ISC High Performance), 2016.

C. Ashcraft, R. G. Grimes, and J. G. Lewis, Accurate Symmetric Indefinite Linear Equation Solvers, SIAM Journal on Matrix Analysis and Applications, vol.20, issue.2, pp.513-561, 1998.
DOI : 10.1137/S0895479896296921

M. Baboulin, J. J. Dongarra, J. Hermann, and S. Tomov, Accelerating Linear System Solutions Using Randomization Techniques, ACM Transactions on Mathematical Software, vol.39, issue.2, 2013.
DOI : 10.1145/2427023.2427025

URL : https://hal.archives-ouvertes.fr/inria-00593306

M. Baboulin, D. Becker, G. Bosilca, A. Danalis, and J. J. Dongarra, An efficient distributed randomized algorithm for solving large dense symmetric indefinite linear systems, Parallel Computing, vol.40, issue.7, pp.213-223, 2014.
DOI : 10.1016/j.parco.2013.12.003

URL : https://hal.archives-ouvertes.fr/hal-01024857

M. Baboulin, X. S. Li, and F. Rouet, Using Random Butterfly Transformations to Avoid Pivoting in Sparse Direct Methods, Proceedings of the International Conference on Vector and Parallel Processing, Eugene (OR), 2014.
DOI : 10.1007/978-3-319-17353-5_12

URL : https://hal.archives-ouvertes.fr/hal-01205703

G. Ballard, D. Becker, J. Demmel, J. Dongarra, A. Druinsky et al., Communication-Avoiding Symmetric-Indefinite Factorization, SIAM Journal on Matrix Analysis and Applications, vol.35, issue.4, pp.1364-1406, 2014.
DOI : 10.1137/130929060

URL : http://nma.berkeley.edu/ark:/28722/bk00148291v

J. R. Bunch and B. N. Parlett, Direct Methods for Solving Symmetric Indefinite Systems of Linear Equations, SIAM Journal on Numerical Analysis, vol.8, issue.4, pp.639-655, 1971.
DOI : 10.1137/0708060

J. R. Bunch and L. Kaufman, Some stable methods for calculating inertia and solving symmetric linear systems, Mathematics of Computation, vol.31, issue.137, pp.163-179, 1977.
DOI : 10.1090/S0025-5718-1977-0428694-0

URL : http://www.ams.org/mcom/1977-31-137/S0025-5718-1977-0428694-0/S0025-5718-1977-0428694-0.pdf

G. Ballard, D. Becker, J. Demmel, J. Dongarra, A. Druinsky et al., Implementing a Blocked Aasen's Algorithm with a Dynamic Scheduler on Multicore Architectures, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing, pp.895-907, 2013.
DOI : 10.1109/IPDPS.2013.98

URL : http://www.cs.berkeley.edu/%7Eodedsc/papers/block_aasen.pdf

I. Yamazaki, S. Tomov, and J. Dongarra, Non-GPU-resident Dense Symmetric Indefinite Factorization, Concurrency and Computation: Practice and Experience, 2016.
DOI : 10.1002/cpe.4012

D. Becker, M. Baboulin, and J. J. Dongarra, Reducing the Amount of Pivoting in Symmetric Indefinite Systems, Proceedings of the 9th International Conference on Parallel Processing and Applied Mathematics, pp.133-142, 2011.
DOI : 10.1007/978-3-642-31464-3_14

URL : https://hal.archives-ouvertes.fr/inria-00593694

Å. Björck, Numerical Methods for Least Squares Problems, 1996.
DOI : 10.1137/1.9781611971484

A. Buttari, J. Dongarra, J. Langou, J. Langou, P. Luszczek et al., Mixed Precision Iterative Refinement Techniques for the Solution of Dense Linear Systems, The International Journal of High Performance Computing Applications, vol.21, issue.4, pp.457-466, 2007.
DOI : 10.1137/1.9780898718058

M. Baboulin, A. Buttari, J. Dongarra, J. Kurzak, J. Langou et al., Accelerating scientific computations with mixed precision algorithms, Computer Physics Communications, vol.180, issue.12, pp.2526-2533, 2009.
DOI : 10.1016/j.cpc.2008.11.005

URL : http://arxiv.org/pdf/0808.2794

A. Castaldo and R. Whaley, Scaling LAPACK panel operations using parallel cache assignment, Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp.223-232, 2010.

J. Demmel, L. Grigori, M. Hoemmen, and J. Langou, Communication-optimal parallel and sequential QR and LU factorizations, SIAM Journal on Scientific Computing, vol.34, issue.1, pp.A206-A239, 2012. Also available as an EECS Department technical report.
DOI : 10.1137/080731992

URL : https://hal.archives-ouvertes.fr/hal-00870930

L. Grigori, J. Demmel, and H. Xiang, CALU: A Communication Optimal LU Factorization Algorithm, SIAM Journal on Matrix Analysis and Applications, vol.32, issue.4, pp.1317-1350, 2011.
DOI : 10.1137/100788926

URL : https://hal.archives-ouvertes.fr/hal-00651137

F. Gustavson, Recursion leads to automatic variable blocking for dense linear-algebra algorithms, IBM Journal of Research and Development, vol.41, issue.6, pp.737-755, 1997.
DOI : 10.1147/rd.416.0737

N. J. Higham, Accuracy and stability of numerical algorithms, SIAM, 2002.
DOI : 10.1137/1.9780898718027

URL : http://eprints.maths.manchester.ac.uk/238/4/asna2_cover.pdf

J. Nédélec, Acoustic and Electromagnetic Equations: Integral Representations for Harmonic Problems, Applied Mathematical Sciences, vol.144, 2001.

D. S. Parker, Random Butterfly Transformations with Applications in Computational Linear Algebra, 1995.

M. Rozložník, G. Shklarski, and S. Toledo, Partitioned Triangular Tridiagonalization, ACM Transactions on Mathematical Software, vol.37, issue.4, pp.1-16, 2011.
DOI : 10.1145/1916461.1916462

M. Baboulin, J. Dongarra, J. Demmel, S. Tomov, and V. Volkov, Enhancing the performance of dense linear algebra solvers on GPUs in the MAGMA project, Poster at Supercomputing (SC'08), 2008.

S. Tomov, J. Dongarra, and M. Baboulin, Towards dense linear algebra for hybrid GPU accelerated manycore systems, Parallel Computing, vol.36, issue.5-6, pp.232-240, 2010.
DOI : 10.1016/j.parco.2009.12.005

URL : http://icl.cs.utk.edu/news_pub/submissions/tdb.pdf

S. Toledo, Locality of Reference in LU Decomposition with Partial Pivoting, SIAM Journal on Matrix Analysis and Applications, vol.18, issue.4, pp.1065-1081, 1997.
DOI : 10.1137/S0895479896297744

J. H. Wilkinson, Rounding Errors in Algebraic Processes, 1963.

C. B. Moler, Iterative Refinement in Floating Point, Journal of the ACM, vol.14, issue.2, pp.316-321, 1967.
DOI : 10.1145/321386.321394

J. Dongarra, J. Kurzak, P. Luszczek, T. Moore, and S. Tomov, Numerical algorithms and libraries at exascale, http://www.hpcwire.com, 2015.

R. Nath, S. Tomov, and J. Dongarra, An Improved MAGMA GEMM for Fermi Graphics Processing Units, The International Journal of High Performance Computing Applications, vol.24, issue.4, pp.511-515, 2010.
DOI : 10.1016/S0167-8191(00)00087-9

Y. Yan, B. M. Chapman, and M. Wong, A comparison of heterogeneous and manycore programming models

J. Dongarra, M. Gates, A. Haidar, J. Kurzak, P. Luszczek et al., Accelerating Numerical Dense Linear Algebra Calculations with GPUs, Numerical Computations with GPUs, pp.1-26, 2014.
DOI : 10.1007/978-3-319-06548-9_1

A. Haidar, C. Cao, I. Yamazaki, J. Dongarra, M. Gates et al., Performance and Portability with OpenCL for Throughput-Oriented HPC Workloads Across Accelerators, Coprocessors, and Multicore Processors, 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA 14), 2014.

A. Haidar, J. Dongarra, K. Kabir, M. Gates, P. Luszczek et al., HPC programming on Intel many-integrated-core hardware with MAGMA port to Xeon Phi, Scientific Programming, 2015.