Å. Björck, Numerical Methods in Matrix Computations, 2015.

A. Konstantinidis and P. H. J. Kelly, More Definite Results from the PluTo Scheduling Algorithm, 1st International Workshop on Polyhedral Compilation Techniques (IMPACT), 2011.

T. Baroudi, R. Seghir, and V. Loechner, Optimization of Triangular and Banded Matrix Operations Using 2d-Packed Layouts, ACM Trans. Archit. Code Optim, vol.14, p.55, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01633724

N. Bell and M. Garland, Efficient Sparse Matrix-Vector Multiplication on CUDA, 2008.

U. Bondhugula, M. Baskaran, S. Krishnamoorthy, J. Ramanujam, A. Rountev et al., Automatic Transformations for Communication-Minimized Parallelization and Locality Optimization in the Polyhedral Model, International Conference on Compiler Construction, 2008.

U. Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan, A Practical Automatic Polyhedral Program Optimization System, ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2008.

A. Buluç and J. R. Gilbert, Challenges and Advances in Parallel Sparse Matrix-Matrix Multiplication, Proceedings of the 2008 37th International Conference on Parallel Processing (ICPP '08), pp.503-510, 2008.

A. Buttari, J. Langou, J. Kurzak, and J. Dongarra, A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures, Parallel Comput, vol.35, pp.38-53, 2009.
URL : https://hal.archives-ouvertes.fr/hal-02420965

H. Cui, J. Xue, L. Wang, and Y. Yang, Extendable Pattern-oriented Optimization Directives, ACM Trans. Archit. Code Optim, vol.9, p.14, 2012.

H. Cui and Q. Yi, Layout-oblivious Compiler Optimization for Matrix Computations, ACM Trans. Archit. Code Optim, vol.9, p.35, 2013.

J. R. Gilbert, S. Reinhardt, and V. B. Shah, High-performance Graph Algorithms from Parallel Sparse Matrices, Proceedings of the 8th International Conference on Applied Parallel Computing: State of the Art in Scientific Computing (PARA'07), pp.260-269, 2007.

F. G. Gustavson, J. Waśniewski, J. J. Dongarra, and J. Langou, Rectangular Full Packed Format for Cholesky's Algorithm: Factorization, Solution, and Inversion, ACM Trans. Math. Softw, vol.37, issue.18, 2010.

Z. Hu, J. del Cuvillo, W. Zhu, and G. R. Gao, Optimization of Dense Matrix Multiplication on IBM Cyclops-64: Challenges and Experiences, pp.134-144, 2006.

G. M. Megson and X. Chen, Automatic Parallelization for a Class of Regular Computations, 1997.

Y. Nagasaka, A. Nukada, and S. Matsuoka, Adaptive Multi-level Blocking Optimization for Sparse Matrix Vector Multiplication on GPU, Procedia Computer Science, vol.80, pp.131-142, 2016.

L.-N. Pouchet, PolyBench/C 4.1: The Polyhedral Benchmark Suite, 2015.

Y. Saad, Iterative Methods for Sparse Linear Systems, 2003.

J. Shirako and V. Sarkar, Integrating Data Layout Transformations with the Polyhedral Model, Proceedings of International Workshop on Polyhedral Compilation Techniques (IMPACT'19), 2019.

V. Volkov and J. W. Demmel, Benchmarking GPUs to Tune Dense Linear Algebra, Proceedings of the 2008 ACM/IEEE Conference on Supercomputing (SC '08), vol.11, p.31, 2008.