B. W. Bader and T. G. Kolda, Algorithm 862: MATLAB tensor classes for fast algorithm prototyping, ACM TOMS, vol.32, issue.4, pp.635-653, 2006.

G. Ballard, N. Knight, and K. Rouse, Communication lower bounds for matricized tensor times Khatri-Rao product, IPDPS, pp.557-567, 2018.

A. Y. Grama, A. Gupta, and V. Kumar, Isoefficiency: Measuring the scalability of parallel algorithms and architectures, IEEE Parallel & Distributed Technology: Systems & Applications, vol.1, issue.3, pp.12-21, 1993.

F. Kjolstad, S. Kamil, S. Chou, D. Lugato, and S. Amarasinghe, The Tensor Algebra Compiler, Proc. ACM Program. Lang., vol.1, issue.OOPSLA, article 77, 2017.

T. G. Kolda and B. W. Bader, Tensor decompositions and applications, SIAM Review, vol.51, issue.3, pp.455-500, 2009.

J. Li, C. Battaglino, I. Perros, J. Sun, and R. Vuduc, An input-adaptive and in-place approach to dense tensor-times-matrix multiply, SC'15, article 76, 12 pages, 2015.

D. Matthews, High-performance tensor contraction without transposition, SIAM Journal on Scientific Computing, vol.40, issue.1, pp.C1-C24, 2018.

G. M. Morton, A computer oriented geodetic data base and a new technique in file sequencing, 1966.

F. Pawlowski, B. Uçar, and A. N. Yzelman, A multi-dimensional Morton-ordered block storage for mode-oblivious tensor computations, Journal of Computational Science, 2019. URL: https://hal.archives-ouvertes.fr/hal-02082524

E. Solomonik, D. Matthews, J. R. Hammond, J. F. Stanton, and J. Demmel, A massively parallel tensor contraction framework for coupled-cluster computations, Journal of Parallel and Distributed Computing, vol.74, issue.12, pp.3176-3190, 2014.

P. Springer and P. Bientinesi, Design of a high-performance gemm-like tensor-tensor multiplication, ACM TOMS, vol.44, issue.3, 2018.