Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects, Journal of Physics: Conference Series, vol.180, issue.30, pp.12037-12066, 2009. ,
DOI : 10.1088/1742-6596/180/1/012037
URL : http://iopscience.iop.org/article/10.1088/1742-6596/180/1/012037/pdf
Parallel algebraic domain decomposition solver for the solution of augmented systems, Advances in Engineering Software, vol.60, issue.61, pp.23-30, 2013. ,
DOI : 10.1016/j.advengsoft.2012.07.004
URL : https://hal.archives-ouvertes.fr/hal-00719512
Robust Memory-Aware Mappings for Parallel Multifrontal Factorizations, SIAM Journal on Scientific Computing, vol.38, issue.3, 2016. ,
DOI : 10.1137/130938505
URL : https://hal.archives-ouvertes.fr/hal-01334113
Adapting a parallel sparse direct solver to architectures with clusters of SMPs, Parallel Computing, vol.29, issue.11-12, pp.1645-1668, 2003. ,
DOI : 10.1016/j.parco.2003.05.010
An Approximate Minimum Degree Ordering Algorithm, SIAM Journal on Matrix Analysis and Applications, vol.17, issue.4, pp.886-905, 1996. ,
DOI : 10.1137/S0895479894278952
Multifrontal parallel distributed symmetric and unsymmetric solvers, Computer Methods in Applied Mechanics and Engineering, vol.184, issue.2-4, pp.501-520, 2000. ,
DOI : 10.1016/S0045-7825(99)00242-X
URL : https://hal.archives-ouvertes.fr/hal-00856651
Analysis, tuning and comparison of two general sparse solvers for distributed memory computers, ACM Trans. Math. Softw, vol.27, issue.4, pp.388-420, 2001. ,
DOI : 10.2172/776597
URL : https://hal.archives-ouvertes.fr/hal-00856654
Hybrid scheduling for the parallel solution of linear systems, Parallel Computing, vol.32, issue.2, pp.136-156, 2006. ,
DOI : 10.1016/j.parco.2005.07.004
URL : https://hal.archives-ouvertes.fr/hal-00358623
Improving Multifrontal Methods by Means of Block Low-Rank Representations, SIAM Journal on Scientific Computing, vol.37, issue.3, pp.1451-1474, 2015. ,
DOI : 10.1137/120903476
URL : https://hal.archives-ouvertes.fr/hal-01237169
On the Complexity of the Block Low-Rank Multifrontal Factorization, SIAM Journal on Scientific Computing, vol.39, issue.4, pp.2016-2019, 2016. ,
DOI : 10.1137/16M1077192
URL : https://hal.archives-ouvertes.fr/hal-01672943
A fast block low-rank dense solver with applications to finite-element matrices, Journal of Computational Physics, vol.304, pp.170-188, 2016. ,
DOI : 10.1016/j.jcp.2015.10.012
A fast, memory efficient and robust sparse preconditioner based on a multifrontal approach with applications to finite-element matrices, International Journal for Numerical Methods in Engineering, vol.1, issue.4, 2016. ,
DOI : 10.1002/nla.1680010405
A Block Low-Rank Multithreaded Factorization for Dense BEM Operators, SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP 2016), pp.72-78, 2016. ,
An efficient and transparent thread migration scheme in the PM2 runtime system, pp.496-510, 1999. ,
DOI : 10.1007/BFb0097934
URL : https://hal.archives-ouvertes.fr/inria-00073068
Exploring Thread and Memory Placement on NUMA Architectures: Solaris and Linux, UltraSPARC/FirePlane and Opteron/HyperTransport, HiPC, pp.338-352, 2006. ,
DOI : 10.1007/11945918_35
URL : http://cs.anu.edu.au/~Alistair.Rendell/papers/ThreadAndMemoryPlacement.Springer.pdf
The Traveling Salesman Problem: A Computational Study (Princeton Series in Applied Mathematics), 2007. ,
A Fan-In Algorithm for Distributed Sparse Numerical Factorization, SIAM Journal on Scientific and Statistical Computing, vol.11, issue.3, pp.593-599, 1990. ,
DOI : 10.1137/0911033
A comparison of three column based distributed sparse factorization schemes, Proc. Fifth SIAM Conf. on Parallel Processing for Scientific Computing, 1991. ,
DOI : 10.21236/ADA228143
The Fan-Both Family of Column-Based Distributed Cholesky Factorization Algorithms, pp.159-190, 1993. ,
DOI : 10.1007/978-1-4613-8369-7_8
StarPU: a unified platform for task scheduling on heterogeneous multicore architectures, Concurrency and Computation: Practice and Experience, pp.30-31, 2011. ,
URL : https://hal.archives-ouvertes.fr/inria-00384363
Efficient parallel resolution of the simplified transport equations in mixed-dual formulation, Journal of Computational Physics, vol.230, issue.5, pp.2004-2020, 2011. ,
DOI : 10.1016/j.jcp.2010.11.047
URL : https://hal.archives-ouvertes.fr/hal-00547406
Flexible Development of Dense Linear Algebra Algorithms on Massively Parallel Architectures with DPLASMA, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum, pp.30-31, 2011. ,
DOI : 10.1109/IPDPS.2011.299
DAGuE: A generic distributed DAG engine for High Performance Computing, Parallel Computing, vol.38, issue.12, 2012. ,
Athapascan runtime: Efficiency for irregular problems, pp.591-600, 1997. ,
DOI : 10.1007/BFb0002788
hwloc: A Generic Framework for Managing Hardware Affinities in HPC Applications, 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing, pp.180-186, 2010. ,
DOI : 10.1109/PDP.2010.67
URL : https://hal.archives-ouvertes.fr/inria-00429889
Fine-grained multithreading for the multifrontal QR factorization of sparse matrices, 2013. To appear in SIAM SISC and APO technical report number RT-APO-11-6 ,
The Impact of Multicore on Math Software, Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing, PARA'06, pp.1-10, 2007. ,
DOI : 10.1007/978-3-540-75755-9_1
A class of parallel tiled linear algebra algorithms for multicore architectures, Parallel Computing, vol.35, issue.1, pp.38-53, 2009. ,
DOI : 10.1016/j.parco.2008.10.002
Incomplete LU factorization: A multifrontal approach, 1995. ,
Optimizations of hybrid sparse linear solvers relying on Schur complement and domain decomposition approaches, p.43, 2015. ,
URL : https://hal.archives-ouvertes.fr/tel-01228520
An improved recursive graph bipartitioning algorithm for well balanced domain decomposition, 2014 21st International Conference on High Performance Computing (HiPC), pp.1-10, 2014. ,
DOI : 10.1109/HiPC.2014.7116878
URL : https://hal.archives-ouvertes.fr/hal-01056749
An Efficient Solver for Sparse Linear Systems Based on Rank-Structured Cholesky Factorization, 2015. ,
A framework for block ILU factorizations using block-size reduction, Mathematics of Computation, vol.64, issue.209, pp.129-156, 1995. ,
Conception d'un solveur haute performance de systèmes linéaires creux couplant des méthodes multigrilles et directes pour la résolution des équations de Maxwell 3D en régime harmonique discrétisées par éléments finis, 2009. ,
A Parallel Full Geometric Multigrid Solver for Time Harmonic Maxwell Problems, SIAM Journal on Scientific Computing, vol.36, issue.2, pp.119-138, 2014. ,
DOI : 10.1137/130909512
URL : https://hal.archives-ouvertes.fr/hal-00847966
Deflated and augmented Krylov subspace techniques, Numerical Linear Algebra with Applications, pp.43-66, 1997. ,
DOI : 10.1002/(sici)1099-1506(199701/02)4:1<43::aid-nla99>3.0.co;2-z
URL : ftp://ftp.cs.umn.edu/dept/users/saad/reports/FILES/umsi-95-181.ps.gz
Algorithmic study and complexity bounds for a nested dissection solver, Numerische Mathematik, vol.8, issue.4, pp.463-476, 1989. ,
DOI : 10.1007/BF01396049
Compact DAG representation and its symbolic scheduling, Journal of Parallel and Distributed Computing, vol.64, issue.8, pp.921-935, 2004. ,
DOI : 10.1016/j.jpdc.2004.05.001
URL : https://hal.archives-ouvertes.fr/inria-00099958
Direct Methods for Sparse Linear Systems, Society for Industrial and Applied Mathematics, 2006. ,
DOI : 10.1137/1.9780898718881
Multifrontal sparse QR factorization: Multicore, and GPU work in progress, 15th SIAM Conference on Parallel Processing for Scientific Computing, 2012. ,
Algorithm 915, SuiteSparseQR, ACM Transactions on Mathematical Software, vol.38, issue.1, 2011. ,
DOI : 10.1145/2049662.2049670
The University of Florida sparse matrix collection, ACM Transactions on Mathematical Software, vol.38, issue.1, pp.1-1, 2011. ,
DOI : 10.1145/2049662.2049663
A survey of direct methods for sparse linear systems, Acta Numerica, vol.1, issue.2, pp.383-566, 2016. ,
DOI : 10.1137/0907081
Shape-optimized mesh partitioning and load balancing for parallel adaptive FEM, Parallel Computing, vol.26, issue.12, pp.1555-1581, 2000. ,
DOI : 10.1016/S0167-8191(00)00043-0
Direct methods for sparse matrices, 1986. ,
DOI : 10.1093/acprof:oso/9780198508380.001.0001
The Multifrontal Solution of Indefinite Sparse Symmetric Linear Equations, ACM Transactions on Mathematical Software, vol.9, issue.3, pp.302-325, 1983. ,
DOI : 10.1145/356044.356047
Ordonnancement hybride statique-dynamique en algèbre linéaire creuse pour de grands clusters de machines NUMA et multi-coeurs, 2009. ,
Dynamic Scheduling for sparse direct Solver on NUMA architectures, PARA'08, pp.19-29, 2008. ,
URL : https://hal.archives-ouvertes.fr/inria-00344026
Conception d'un solveur linéaire creux parallèle hybride direct-itératif, 2009. ,
A Parallel Direct/Iterative Solver Based on a Schur Complement Approach, 2008 11th IEEE International Conference on Computational Science and Engineering, pp.98-105, 2008. ,
DOI : 10.1109/CSE.2008.36
URL : https://hal.archives-ouvertes.fr/hal-00353589
Nested Dissection of a Regular Finite Element Mesh, SIAM Journal on Numerical Analysis, vol.10, issue.2, pp.345-363, 1973. ,
DOI : 10.1137/0710032
Sparse Cholesky Factorization on a Local-Memory Multiprocessor, SIAM Journal on Scientific and Statistical Computing, vol.9, issue.2, pp.327-340, 1988. ,
DOI : 10.1137/0909021
Computer Solution of Large Sparse Positive Definite Systems, p.7, 1981. ,
On the Application of the Minimum Degree Algorithm to Finite Element Systems, SIAM Journal on Numerical Analysis, vol.15, issue.1, pp.90-112, 1978. ,
DOI : 10.1137/0715006
Multifrontal Factorization of Sparse SPD Matrices on GPUs, 2011 IEEE International Parallel & Distributed Processing Symposium, pp.372-383, 2011. ,
DOI : 10.1109/IPDPS.2011.44
An Efficient Multicore Implementation of a Novel HSS-Structured Multifrontal Solver Using Randomized Sampling, SIAM Journal on Scientific Computing, vol.38, issue.5, pp.358-384, 2016. ,
DOI : 10.1137/15M1010117
Sparse approximations of the Schur complement for parallel algebraic hybrid linear solvers in 3D, p.53, 2010. ,
URL : https://hal.archives-ouvertes.fr/inria-00466828
Outils numériques parallèles pour la résolution de très grands problèmes d'électromagnétisme, Séminaire sur l'Algorithmique Numérique Appliquée aux Problèmes Industriels, 2003. ,
Parallel sparse linear algebra and application to structural mechanics, Numerical Algorithms, pp.371-391, 2000. ,
URL : https://hal.archives-ouvertes.fr/inria-00346016
Parallel black box $\mathcal{H}$-LU preconditioning for elliptic boundary value problems, Computing and Visualization in Science, vol.40, issue.1, pp.273-291, 2008. ,
DOI : 10.1007/s00791-008-0098-9
Performance of H-LU Preconditioning for Sparse Matrices, Computational Methods in Applied Mathematics, pp.336-349, 2008. ,
DOI : 10.2478/cmam-2008-0024
Parallel Symbolic Factorization for Sparse LU with Static Pivoting, SIAM Journal on Scientific Computing, vol.29, issue.3, pp.1289-1314, 2007. ,
DOI : 10.1137/050638102
Recent Progress in General Sparse Direct Solvers, In Lecture Notes in Computer Science, vol.2073, pp.823-840, 2001. ,
DOI : 10.1007/3-540-45545-0_94
Hierarchical Matrices: Algorithms and Analysis, Series in Computational Mathematics, pp.72-83, 2015. ,
DOI : 10.1007/978-3-662-47324-5
URL : https://link.springer.com/content/pdf/bfm%3A978-3-662-47324-5%2F1.pdf
Error Detecting and Error Correcting Codes, Bell System Technical Journal, vol.29, issue.2, pp.147-160, 1950. ,
DOI : 10.1002/j.1538-7305.1950.tb00463.x
URL : https://calhoun.nps.edu/bitstream/10945/46756/1/Hamming_1982.pdf
Parallel Algorithms for Sparse Linear Systems, SIAM Review, vol.33, issue.3, pp.420-460, 1991. ,
DOI : 10.1137/1033099
A multi-level algorithm for partitioning graphs, Supercomputing Proceedings of the IEEE/ACM SC95 Conference, pp.28-28, 1995. ,
Improving the Run Time and Quality of Nested Dissection Ordering, SIAM Journal on Scientific Computing, vol.20, issue.2, pp.468-489, 1998. ,
DOI : 10.1137/S1064827596300656
Graph partitioning models for parallel computing, Parallel Computing, vol.26, issue.12, pp.1519-1534, 2000. ,
DOI : 10.1016/S0167-8191(00)00048-X
An Improved Spectral Graph Partitioning Algorithm for Mapping Parallel Computations, SIAM Journal on Scientific Computing, vol.16, issue.2, pp.452-469, 1995. ,
DOI : 10.1137/0916028
Distribution des Données et Régulation Statique des Calculs et des Communications pour la Résolution de Grands Systèmes Linéaires Creux par Méthode Directe, pp.11-14, 2001. ,
A Mapping and Scheduling Algorithm for Parallel Sparse Fan-In Numerical Factorization, Proceedings of EuroPAR'99, number 1685 in Lecture Notes in Computer Science, pp.1059-1067, 1999. ,
DOI : 10.1007/3-540-48311-X_148
PaStiX: A Parallel Sparse Direct Solver Based on a Static Scheduling for Mixed 1D/2D Block Distributions, Proceedings of Irregular'2000, number 1800 in Lecture Notes in Computer Science, pp.519-525, 2000. ,
DOI : 10.1007/3-540-45591-4_70
PaStiX: a high-performance parallel direct solver for sparse symmetric positive definite systems, Parallel Computing, vol.28, issue.2, pp.301-321, 2002. ,
DOI : 10.1016/S0167-8191(01)00141-7
Efficient algorithms for direct resolution of large sparse system on clusters of SMP nodes, SIAM Conference on Applied Linear Algebra, p.8, 2003. ,
On finding approximate supernodes for an efficient block-ILU(k) factorization, Parallel Computing, vol.34, issue.6-8, pp.345-362, 2008. ,
DOI : 10.1016/j.parco.2007.12.003
A Parallel Multistage ILU Factorization Based on a Hierarchical Graph Decomposition, SIAM Journal on Scientific Computing, vol.28, issue.6, pp.2266-2293, 2006. ,
DOI : 10.1137/040608258
Hierarchical interpolative factorization for elliptic operators: differential equations, Communications on Pure and Applied Mathematics, 2015. ,
Design of a Multicore Sparse Cholesky Factorization Using DAGs, SIAM Journal on Scientific Computing, vol.32, issue.6, pp.3627-3649, 2010. ,
DOI : 10.1137/090757216
Non-linear MHD simulations of edge localized modes (ELMs), Plasma Physics, vol.51, issue.12, p.124012, 2009. ,
A Scalable Parallel Algorithm for Incomplete Factor Preconditioning, SIAM Journal on Scientific Computing, vol.22, issue.6, pp.2194-2215, 2001. ,
DOI : 10.1137/S1064827500376193
The FLAME approach: From dense linear algebra algorithms to high-performance multi-accelerator implementations, Journal of Parallel and Distributed Computing, vol.72, issue.9, pp.1134-1143, 2012. ,
DOI : 10.1016/j.jpdc.2011.10.014
The Traveling Salesman Problem: A Case Study in Local Optimization, pp.215-310, 1997. ,
PSPASES : Scalable Parallel Direct Solver Library for Sparse Symmetric Positive Definite Linear Systems, 1999. ,
A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs, SIAM Journal on Scientific Computing, vol.20, issue.1, pp.359-392, 1998. ,
DOI : 10.1137/S1064827595287997
MeTiS: A Software Package for Partitioning Unstructured Graphs, Partitioning Meshes, and Computing Fill-Reducing Orderings of Sparse Matrices, Version 4.0, 1998. ,
Parallel multilevel k-way partitioning scheme for irregular graphs, Proceedings of the 1996 ACM/IEEE conference on Supercomputing (CDROM), Supercomputing '96, pp.96-129, 1998. ,
DOI : 10.1145/369028.369103
URL : http://www.cs.umn.edu/~kumar/papers/mlevel_kparallel.ps
Parallel threshold-based ILU factorization, Proceedings of the 1997 ACM/IEEE conference on Supercomputing (CDROM), Supercomputing '97, pp.1-24, 1997. ,
DOI : 10.1145/509593.509621
Scalable sparse tensor decompositions in distributed memory systems, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis on, SC '15, pp.1-77, 2015. ,
DOI : 10.1145/2807591.2807624
URL : https://hal.archives-ouvertes.fr/hal-01148202
High Performance Parallel Algorithms for the Tucker Decomposition of Sparse Tensors, 2016 45th International Conference on Parallel Processing (ICPP), pp.103-112, 2016. ,
DOI : 10.1109/ICPP.2016.19
URL : https://hal.archives-ouvertes.fr/hal-01354894
Experiences in scaling scientific applications on current-generation quad-core processors, 2008 IEEE International Symposium on Parallel and Distributed Processing, 2008. ,
DOI : 10.1109/IPDPS.2008.4536342
$\mathcal{H}$-LU factorization on many-core systems, Computing and Visualization in Science, vol.35, issue.2, pp.105-117, 2013. ,
DOI : 10.1145/1365490.1365500
Autotuning GEMM Kernels for the Fermi GPU, IEEE Transactions on Parallel and Distributed Systems, vol.23, issue.11, pp.2045-2057, 2012. ,
DOI : 10.1109/TPDS.2011.311
URL : http://www.netlib.org/utk/people/JackDongarra/PAPERS/auto-fermi-2012.pdf
Scheduling and memory optimizations for sparse direct solver on multi-core/multi-gpu cluster systems, p.5, 2015. ,
Taking Advantage of Hybrid Systems for Sparse Direct Solvers via Task-Based Runtimes, 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, pp.29-38, 2014. ,
DOI : 10.1109/IPDPSW.2014.9
URL : https://hal.archives-ouvertes.fr/hal-00987094
Méthode de décomposition de domaine pour les équations du transport simplifié en neutronique, 2010. ,
Evaluation of Sparse LU Factorization and Triangular Solution on Multicore Platforms, Lecture Notes in Computer Science, vol.29, issue.2, pp.287-300, 2008. ,
DOI : 10.1145/1362622.1362674
A scalable sparse direct solver using static pivoting, Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, pp.8-43, 1999. ,
Generalized Nested Dissection, SIAM Journal on Numerical Analysis, vol.16, issue.2, pp.9-53, 1979. ,
DOI : 10.1137/0716027
A Separator Theorem for Planar Graphs, SIAM Journal on Applied Mathematics, vol.36, issue.2, pp.177-189, 1979. ,
DOI : 10.1137/0136016
URL : http://historical.ncstrl.org/litesite-data/stan/CS-TR-77-627.pdf
Modification of the minimum-degree algorithm by multiple elimination, ACM Transactions on Mathematical Software, vol.11, issue.2, pp.141-153, 1985. ,
DOI : 10.1145/214392.214398
The Role of Elimination Trees in Sparse Factorization, SIAM Journal on Matrix Analysis and Applications, vol.11, issue.1, pp.134-172, 1990. ,
DOI : 10.1137/0611010
On finding supernodes for sparse matrix computations, SIAM Journal on Matrix Analysis and Applications, vol.14, issue.1, pp.242-252, 1993. ,
Multifrontal Computations on GPUs and Their Multi-core Hosts, Proceedings of the 9th international conference on High performance computing for computational science, VECPAR'10, pp.71-82, 2011. ,
DOI : 10.1016/0167-8191(86)90019-0
Scheduling of algorithms based on elimination trees on NUMA systems, EuroPar'99, pp.1068-1072, 1999. ,
A new diffusion-based multilevel algorithm for computing graph partitions, Journal of Parallel and Distributed Computing, vol.69, issue.9, pp.750-761, 2009. ,
DOI : 10.1016/j.jpdc.2009.04.005
A unified geometric approach to graph separators, [1991] Proceedings 32nd Annual Symposium of Foundations of Computer Science, pp.538-547, 1991. ,
DOI : 10.1109/SFCS.1991.185417
Density graphs and separators, Second Annual ACM-SIAM Symposium on Discrete Algorithms, pp.331-336, 1991. ,
A generalized domain decomposition paradigm for parallel incomplete LU factorization preconditionings, High Performance Computing and Networking, pp.925-932, 2001. ,
DOI : 10.1016/S0167-739X(01)00034-6
Massively Parallel Cartesian Discrete Ordinates Method for Neutron Transport Simulation, 2015. ,
URL : https://hal.archives-ouvertes.fr/tel-01379686
Shared Memory Parallelism for 3D Cartesian Discrete Ordinates Solver, Annals of Nuclear Energy, pp.1-10, 2014. ,
DOI : 10.1016/j.anucene.2014.08.034
URL : https://hal.archives-ouvertes.fr/hal-00986975
An Improved Magma Gemm For Fermi Graphics Processing Units, The International Journal of High Performance Computing Applications, vol.24, issue.4, pp.511-515, 2010. ,
DOI : 10.1016/S0167-8191(00)00087-9
CUBLAS library, NVIDIA Corporation, 2008. ,
Scotch: A software package for static mapping by dual recursive bipartitioning of process and architecture graphs, High-Performance Computing and Networking, pp.493-498, 1996. ,
DOI : 10.1007/3-540-61142-8_588
Sparse matrix ordering with Scotch, Proceedings of HPCN'97, pp.370-378, 1997. ,
DOI : 10.1007/BFb0031609
Hybridizing nested dissection and halo approximate minimum degree for efficient sparse matrix ordering, Concurrency: Practice and Experience, Preliminary version published in Proceedings of Irregular'99, pp.69-84, 2000. ,
DOI : 10.1002/(sici)1096-9128(200002/03)12:2/3<69::aid-cpe472>3.0.co;2-w
URL : http://www.enseeiht.fr/Recherche/Info/Numerique/Noail/Membres/../MUMPS/hamd_cpe.ps.gz
Scotch and libScotch 5.1 User's Guide, pp.10-78, 2008. ,
URL : https://hal.archives-ouvertes.fr/hal-00410332
Sparse Supernodal Solver Using Block Low-Rank Compression, 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp.1138-1147, 2017. ,
DOI : 10.1109/IPDPSW.2017.86
URL : https://hal.archives-ouvertes.fr/hal-01450732
Reordering Strategy for Blocking Optimization in Sparse Linear Solvers, SIAM Journal on Matrix Analysis and Applications, vol.38, issue.1, pp.226-248, 2017. ,
DOI : 10.1137/16M1062454
URL : https://hal.archives-ouvertes.fr/hal-01485507
A Mapping Algorithm for Parallel Sparse Cholesky Factorization, SIAM Journal on Scientific Computing, vol.14, issue.5, pp.1253-1257, 1993. ,
DOI : 10.1137/0914074
Fast hierarchical solvers for sparse matrices using extended sparsification and low-rank approximation, arXiv preprint ,
DOI : 10.1137/15m1046939
URL : http://arxiv.org/pdf/1510.07363
Programming matrix algorithms-by-blocks for thread-level parallelism, ACM Transactions on Mathematical Software, vol.36, issue.3, 2009. ,
DOI : 10.1145/1527286.1527288
A latency tolerant hybrid sparse solver using incomplete Cholesky factorization, Numerical Linear Algebra with Applications, pp.541-560, 2003. ,
DOI : 10.1002/nla.327
ShyLU: A Hybrid-Hybrid Solver for Multicore Platforms, 2012 IEEE 26th International Parallel and Distributed Processing Symposium, pp.631-643, 2012. ,
DOI : 10.1109/IPDPS.2012.64
Algorithmic Aspects of Vertex Elimination on Graphs, SIAM Journal on Computing, vol.5, issue.2, pp.266-283, 1976. ,
DOI : 10.1137/0205021
An Efficient Block-Oriented Approach to Parallel Sparse Cholesky Factorization, SIAM Journal on Scientific Computing, vol.15, issue.6, pp.1413-1439, 1994. ,
DOI : 10.1137/0915085
ILUT: A dual threshold incomplete LU factorization, Numerical Linear Algebra with Applications, vol.19, issue.4, pp.387-402, 1994. ,
DOI : 10.2118/8252-PA
URL : ftp://ftp.cs.umn.edu/dept/users/saad/reports/FILES/umsi-92-38.ps.gz
Iterative Methods for Sparse Linear Systems, Society for Industrial and Applied Mathematics, pp.43-44, 2003. ,
DOI : 10.1137/1.9780898718003
Solving unsymmetric sparse systems of linear equations with PARDISO, Future Generation Computer Systems, vol.20, issue.3, pp.475-487, 2004. ,
DOI : 10.1016/j.future.2003.07.011
Scaling the solution of large sparse linear systems using multifrontal methods on hybrid shared-distributed memory architectures, 2014. ,
URL : https://hal.archives-ouvertes.fr/tel-01111259
Accelerating the Tucker Decomposition with Compressed Sparse Tensors, European Conference on Parallel Processing, 2017. ,
DOI : 10.1145/2783258.2783395
Sparse Tensor Factorization on Many-Core Processors with High-Bandwidth Memory, 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2017. ,
DOI : 10.1109/IPDPS.2017.84
A collection of Fortran codes for large scale scientific computation ,
"Compress and eliminate" solver for symmetric positive definite sparse matrices ,
Fast implementation of DGEMM on Fermi GPU, Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis on, SC '11, pp.1-35, 2011. ,
DOI : 10.1145/2063384.2063431
Test Suite for Evaluating Performance of MPI Implementations That Support MPI_THREAD_MULTIPLE, In EuroPVM/MPI Lecture Notes in Computer Science, vol.4757, pp.46-55, 2007. ,
DOI : 10.1007/978-3-540-75416-9_13
Improving Reactivity and Communication Overlap in MPI Using a Generic I/O Manager, EuroPVM/MPI, pp.170-177, 2007. ,
DOI : 10.1007/978-3-540-75416-9_27
URL : https://hal.archives-ouvertes.fr/inria-00177167
The LibFlame library for dense matrix computations, Computing in science & engineering, vol.11, issue.6, pp.56-63, 2009. ,
Benchmarking GPUs to tune dense linear algebra, 2008 SC, International Conference for High Performance Computing, Networking, Storage and Analysis, pp.31-32, 2008. ,
DOI : 10.1109/SC.2008.5214359
URL : http://bebop.cs.berkeley.edu/pubs/volkov2008-benchmarking.pdf
A Parallel Geometric Multifrontal Solver Using Hierarchically Semiseparable Structure, ACM Transactions on Mathematical Software, vol.42, issue.3, pp.1-21, 2016. ,
DOI : 10.1002/nla.691
A conjugate gradient-truncated direct method for the iterative solution of the reservoir simulation pressure equation, 1981. ,
Superfast Multifrontal Method for Large Structured Linear Systems of Equations, SIAM Journal on Matrix Analysis and Applications, vol.31, issue.3, pp.1382-1411, 2010. ,
DOI : 10.1137/09074543X
On Techniques to Improve Robustness and Scalability of a Parallel Hybrid Linear Solver, High Performance Computing for Computational Science - VECPAR 2010, pp.421-434, 2011. ,
DOI : 10.1145/779359.779361
One-sided Dense Matrix Factorizations on a Multicore with Multiple GPU Accelerators, Procedia Computer Science, vol.9, pp.37-46, 2012. ,
DOI : 10.1016/j.procs.2012.04.005
URL : https://doi.org/10.1016/j.procs.2012.04.005
Sparse hierarchical solvers with guaranteed convergence, arXiv preprint ,
Dynamic Task Execution on Shared and Distributed Memory Architectures ,
A CPU-GPU hybrid approach for the unsymmetric multifrontal method, Parallel Computing, vol.37, issue.12, pp.759-770, 2011. ,
DOI : 10.1016/j.parco.2011.09.002