C. Banino, O. Beaumont, L. Carter, J. Ferrante, A. Legrand et al., Scheduling strategies for master-slave tasking on heterogeneous processor platforms, IEEE Trans. Parallel Distributed Systems, vol.15, issue.4, pp.319-330, 2004.
URL : https://hal.archives-ouvertes.fr/hal-00789427

O. Beaumont, V. Boudet, A. Petitet, F. Rastello, and Y. Robert, A proposal for a heterogeneous cluster ScaLAPACK (dense linear solvers), IEEE Trans. Computers, vol.50, issue.10, pp.1052-1070, 2001.
URL : https://hal.archives-ouvertes.fr/hal-00808287

O. Beaumont, V. Boudet, F. Rastello, and Y. Robert, Matrix multiplication on heterogeneous platforms, IEEE Trans. Parallel Distributed Systems, vol.12, issue.10, pp.1033-1051, 2001.
URL : https://hal.archives-ouvertes.fr/hal-00808288

P. Bhat, C. Raghavendra, and V. Prasanna, Efficient collective communication in distributed heterogeneous systems, Journal of Parallel and Distributed Computing, vol.63, pp.251-263, 2003.

L. S. Blackford, J. Choi, A. Cleary, E. D'Azevedo, J. Demmel et al., ScaLAPACK Users' Guide, SIAM, 1997.

L. E. Cannon, A cellular computer to implement the Kalman filter algorithm, 1969.

R. C. Whaley, A. Petitet, and J. J. Dongarra, Automated empirical optimizations of software and the ATLAS project, Parallel Computing, vol.27, issue.1-2, pp.3-35, 2001.

J. Dongarra, S. Hammarling, and D. Walker, Key concepts for parallel out-of-core LU factorization, Parallel Computing, vol.23, issue.1-2, pp.49-70, 1997.

J. Hong and H. Kung, I/O complexity: the red-blue pebble game, Proceedings of STOC'81, pp.326-333, 1981.

D. Irony, S. Toledo, and A. Tiskin, Communication lower bounds for distributed-memory matrix multiplication, J. Parallel Distributed Computing, vol.64, issue.9, pp.1017-1026, 2004.

A. Kalinov and A. Lastovetsky, Heterogeneous distribution of computations solving linear algebra problems on networks of heterogeneous computers, J. Parallel Distributed Computing, vol.61, issue.4, pp.520-535, 2001.

K. Li and V. Y. Pan, Parallel matrix multiplication on a linear array with a reconfigurable pipelined bus system, IEEE Trans. Computers, vol.50, issue.5, pp.519-525, 2001.

M. Maheswaran, S. Ali, H. Siegel, D. Hensgen, and R. Freund, Dynamic matching and scheduling of a class of independent tasks onto heterogeneous computing systems, Eighth Heterogeneous Computing Workshop, pp.30-44, 1999.

J. Pineau, Y. Robert, F. Vivien, Z. Shi, and J. Dongarra, Revisiting matrix product on master-worker platforms, 2006.
URL : https://hal.archives-ouvertes.fr/hal-00803474

J. Pineau, Y. Robert, F. Vivien, Z. Shi, and J. Dongarra, Revisiting matrix product on master-worker platforms, IEEE Advances in Parallel and Distributed Computational Models, 2007.
URL : https://hal.archives-ouvertes.fr/hal-00803474

T. Saif and M. Parashar, Understanding the behavior and performance of non-blocking communications in MPI, Proceedings of Euro-Par, vol.3149, pp.173-182, 2004.

S. Toledo, A survey of out-of-core algorithms in numerical linear algebra, External Memory Algorithms and Visualization, pp.161-180, 1999.

L. Zhuo and V. K. Prasanna, Scalable and modular algorithms for floating-point matrix multiplication on reconfigurable computing systems, IEEE Trans. Parallel Distributed Systems, vol.18, issue.4, pp.433-448, 2007.