C. Bastoul, Code generation in the polyhedral model is easier than you think, Proc. of the 13th International Conference on Parallel Architectures and Compilation Techniques, 2004.
URL : https://hal.archives-ouvertes.fr/hal-00017260

P. Belotti, Couenne: a user's manual, 2009.

U. Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan, PLUTO: A Practical and Fully Automatic Polyhedral Program Optimization System, Proceedings of the ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation, 2008.

S. Coleman and K. S. McKinley, Tile Size Selection Using Cache Organization and Data Layout, Proceedings of the ACM SIGPLAN 1995 Conference on Programming Language Design and Implementation (PLDI '95), pp.279-290, 1995.

T. D. Crawford and H. F. Schaefer, An introduction to coupled cluster theory for computational chemists, Reviews in Computational Chemistry, pp.33-136, 2000.

P. Feautrier, Some efficient solutions to the affine scheduling problem. I. One-dimensional time, International Journal of Parallel Programming, vol.21, pp.313-347, 1992.

K. Goto and R. A. van de Geijn, Anatomy of high-performance matrix multiplication, ACM Transactions on Mathematical Software (TOMS), vol.34, p.12, 2008.

T. M. Low, F. D. Igual, T. M. Smith, and E. S. Quintana-Ortí, Analytical modeling is enough for high-performance BLIS, ACM Transactions on Mathematical Software (TOMS), vol.43, p.12, 2016.

D. A. Matthews, High-performance tensor contraction without transposition, SIAM Journal on Scientific Computing, vol.40, pp.1-24, 2018.

L. Renganarayana and S. Rajopadhye, Positivity, Posynomials and Tile Size Selection, Proceedings of the 2008 ACM/IEEE Conference on Supercomputing (SC '08), vol.55, 2008.

J. Shirako, K. Sharma, N. Fauzia, L. Pouchet, J. Ramanujam et al., Analytical bounds for optimal tile size selection, International Conference on Compiler Construction, pp.101-121, 2012.

T. M. Smith, R. van de Geijn, M. Smelyanskiy, J. R. Hammond, and F. G. Van Zee, Anatomy of high-performance many-threaded matrix multiplication, 2014 IEEE 28th International Parallel and Distributed Processing Symposium, pp.1049-1059, 2014.

P. Springer and P. Bientinesi, Design of a High-Performance GEMM-like Tensor-Tensor Multiplication, CoRR, 2016.

P. Springer, T. Su, and P. Bientinesi, HPTT: A High-Performance Tensor Transposition C++ Library, Proceedings of the 4th ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming, pp.56-62, 2017.

F. G. Van Zee and R. A. van de Geijn, BLIS: A framework for rapidly instantiating BLAS functionality, ACM Transactions on Mathematical Software (TOMS), vol.41, p.14, 2015.

S. Verdoolaege, J. C. Juega, A. Cohen, J. I. Gómez, C. Tenllado, and F. Catthoor, Polyhedral parallel code generation for CUDA, ACM Transactions on Architecture and Code Optimization (TACO), vol.9, issue.4, pp.1-54, 2013.

E. Wang, Q. Zhang, B. Shen, G. Zhang, X. Lu et al., Intel math kernel library, High-Performance Computing on the Intel® Xeon Phi, pp.167-188, 2014.

T. Yuki, L. Renganarayanan, S. Rajopadhye, C. Anderson, A. E. Eichenberger et al., Automatic Creation of Tile Size Selection Models, Proceedings of the 8th Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO '10), pp.190-199, 2010.