M. Banikazemi, V. Moorthy, and D. K. Panda, Efficient collective communication on heterogeneous networks of workstations, Proceedings of the 27th International Conference on Parallel Processing (ICPP'98), 1998.

M. Banikazemi, J. Sampathkumar, S. Prabhu, D. Panda, and P. Sadayappan, Communication modeling of heterogeneous networks of workstations for performance characterization of collective operations, HCW'99, the 8th Heterogeneous Computing Workshop, pp.125-133, 1999.

C. Banino, O. Beaumont, L. Carter, J. Ferrante, A. Legrand et al., Scheduling strategies for master-slave tasking on heterogeneous processor platforms, IEEE Trans. Parallel Distributed Systems, vol.15, issue.4, pp.319-330, 2004.
URL : https://hal.archives-ouvertes.fr/hal-00789427

A. Bar-Noy, S. Guha, J. S. Naor, and B. Schieber, Message multicasting in heterogeneous networks, SIAM Journal on Computing, vol.30, issue.2, pp.347-358, 2000.

O. Beaumont, V. Boudet, A. Petitet, F. Rastello, and Y. Robert, A proposal for a heterogeneous cluster ScaLAPACK (dense linear solvers), IEEE Trans. Computers, vol.50, issue.10, pp.1052-1070, 2001.
URL : https://hal.archives-ouvertes.fr/hal-00808287

O. Beaumont, V. Boudet, F. Rastello, and Y. Robert, Matrix multiplication on heterogeneous platforms, IEEE Trans. Parallel Distributed Systems, vol.12, issue.10, pp.1033-1051, 2001.
URL : https://hal.archives-ouvertes.fr/hal-00808288

O. Beaumont, V. Boudet, F. Rastello, and Y. Robert, Partitioning a square into rectangles: NP-completeness and approximation algorithms, Algorithmica, vol.34, pp.217-239, 2002.
URL : https://hal.archives-ouvertes.fr/hal-00807407

O. Beaumont, L. Carter, J. Ferrante, A. Legrand, L. Marchal et al., Centralized versus distributed schedulers for multiple bag-of-task applications, International Parallel and Distributed Processing Symposium IPDPS, 2006.

F. Berman, High-performance schedulers, The Grid: Blueprint for a New Computing Infrastructure, pp.279-309, 1999.

P. Bhat, C. Raghavendra, and V. Prasanna, Efficient collective communication in distributed heterogeneous systems, ICDCS'99 19th International Conference on Distributed Computing Systems, pp.15-24, 1999.

P. Bhat, C. Raghavendra, and V. Prasanna, Efficient collective communication in distributed heterogeneous systems, Journal of Parallel and Distributed Computing, vol.63, pp.251-263, 2003.

L. Blackford, J. Choi, A. Cleary, J. Demmel, I. Dhillon et al., ScaLAPACK: A portable linear algebra library for distributed-memory computers - design issues and performance, Supercomputing '96, 1996.

L. S. Blackford, J. Choi, A. Cleary, E. Azevedo, J. Demmel et al., ScaLAPACK Users' Guide. SIAM, 1997.

L. E. Cannon, A cellular computer to implement the Kalman filter algorithm, 1969.

A. Chakravarti, G. Baumgartner, and M. Lauria, Self-organizing scheduling on the organic grid, Int. Journal of High Performance Computing Applications, vol.20, issue.1, pp.115-130, 2006.

Z. Chen, J. Dongarra, P. Luszczek, and K. Roche, Self-adapting software for numerical linear algebra and LAPACK for clusters, Parallel Computing, vol.29, pp.1723-1743, 2003.

M. Cierniak, M. Zaki, and W. Li, Compile-time scheduling algorithms for heterogeneous network of workstations, The Computer Journal, vol.40, issue.6, pp.356-372, 1997.

M. Cierniak, M. Zaki, and W. Li, Customized dynamic load balancing for a network of workstations, Journal of Parallel and Distributed Computing, vol.43, pp.156-162, 1997.

T. H. Cormen, C. E. Leiserson, and R. L. Rivest, Introduction to Algorithms, 1990.

P. E. Crandall and M. J. Quinn, Block data decomposition for data-parallel programming on a heterogeneous workstation network, 2nd International Symposium on High Performance Distributed Computing, pp.42-49, 1993.
DOI : 10.1109/hpdc.1993.263859

J. Cuenca, L. P. Garcia, D. Gimenez, and J. Dongarra, Processes distribution of homogeneous parallel linear algebra routines on heterogeneous clusters, HeteroPar'2005: International Conference on Heterogeneous Computing, 2005.

J. Dongarra, S. Hammarling, and D. Walker, Key concepts for parallel out-of-core LU factorization, Parallel Computing, vol.23, issue.1-2, pp.49-70, 1997.

J. P. Goux, S. Kulkarni, J. Linderoth, and M. Yoder, An enabling framework for master-worker applications on the computational grid, Ninth IEEE International Symposium on High Performance Distributed Computing (HPDC'00), 2000.

E. Heymann, M. A. Senar, E. Luque, and M. Livny, Adaptive scheduling for master-worker applications on the computational grid, Grid Computing - GRID 2000, LNCS, vol.1971, pp.214-227, 2000.

B. Hong and V. Prasanna, Bandwidth-aware resource allocation for heterogeneous computing systems to maximize throughput, Proceedings of the 32nd International Conference on Parallel Processing, 2003.

J. Hong and H. Kung, I/O complexity: the red-blue pebble game, STOC '81: Proceedings of the 13th ACM symposium on Theory of Computing, pp.326-333, 1981.

D. Irony, S. Toledo, and A. Tiskin, Communication lower bounds for distributed-memory matrix multiplication, J. Parallel Distributed Computing, vol.64, issue.9, pp.1017-1026, 2004.

M. Kaddoura, S. Ranka, and A. Wang, Array decomposition for nonuniform computational environments, Journal of Parallel and Distributed Computing, vol.36, pp.91-105, 1996.
DOI : 10.1006/jpdc.1996.0092
URL : https://surface.syr.edu/cgi/viewcontent.cgi?article=1004&context=lcsmith_other

A. Kalinov and A. Lastovetsky, Heterogeneous distribution of computations while solving linear algebra problems on networks of heterogeneous computers, LNCS, vol.1593, pp.191-200, 1999.

S. Khuller and Y. Kim, On broadcasting in heterogeneous networks, Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms, pp.1011-1020, 2004.

A. Lastovetsky and R. Reddy, Data partitioning with a realistic performance model of networks of heterogeneous computers, International Parallel and Distributed Processing Symposium IPDPS, 2004.

P. Liu, Broadcast scheduling optimization for heterogeneous cluster systems, Journal of Algorithms, vol.42, issue.1, pp.135-152, 2002.
DOI : 10.1006/jagm.2001.1204
URL : http://ce.sharif.ac.ir/~ghodsi/archive/d-papers/SPAA/2000/Broadcast scheduling optimization for heterogeneous cluster systems.pdf

M. Maheswaran, S. Ali, H. Siegel, D. Hensgen, and R. Freund, Dynamic matching and scheduling of a class of independent tasks onto heterogeneous computing systems, Eighth Heterogeneous Computing Workshop, pp.30-44, 1999.

C. D. Polychronopoulos, Compiler optimizations for enhancing parallelism and their impact on architecture design, IEEE Transactions on Computers, vol.37, issue.8, pp.991-1004, 1988.

T. Saif and M. Parashar, Understanding the behavior and performance of non-blocking communications in MPI, Proceedings of Euro-Par, vol.3149, pp.173-182, 2004.

G. Shao, Adaptive scheduling of master/worker applications on distributed computational resources, 2001.

G. Shao, F. Berman, and R. Wolski, Master/slave computing on the grid, Heterogeneous Computing Workshop HCW'00, 2000.

S. Toledo, A survey of out-of-core algorithms in numerical linear algebra, External Memory Algorithms and Visualization, pp.161-180, 1999.

J. B. Weissman, Scheduling multi-component applications in heterogeneous wide-area networks, Heterogeneous Computing Workshop HCW'00, 2000.
DOI : 10.1109/hcw.2000.843745

R. C. Whaley and J. Dongarra, Automatically tuned linear algebra software, Proceedings of the ACM/IEEE Symposium on Supercomputing (SC'98), 1998.
DOI : 10.1109/sc.1998.10004
URL : http://www.netlib.org/lapack/lawnspdf/lawn131.pdf
