Journal Articles IEEE Transactions on Parallel and Distributed Systems Year : 2008

Scheduling Parallel Task Graphs on (Almost) Homogeneous Multi-cluster Platforms

Pierre-Francois Dutot, Tchimou N'Takpé, Frédéric Suter, Henri Casanova

Abstract

Applications structured as parallel task graphs exhibit both data and task parallelism, and arise in many domains. Scheduling these applications efficiently on parallel platforms has been a long-standing challenge. For a single homogeneous platform, such as a cluster, results have been obtained both in theory (guaranteed algorithms) and in practice (pragmatic heuristics). Because of their task parallelism, these applications are well suited for execution on distributed platforms that span multiple clusters, possibly in multiple institutions. However, the only available results in this context are non-guaranteed heuristics. In this paper, we develop a scheduling algorithm, MCGAS, that is applicable to multi-cluster platforms that are almost homogeneous. Such platforms are often found as large subsets of multi-cluster platforms. Our novel contribution is that MCGAS computes task allocations so that a (tunable) performance guarantee is provided. Since a performance guarantee does not necessarily imply good average performance in practice, we also compare MCGAS with a recently proposed non-guaranteed algorithm. Using simulation over a wide range of experimental scenarios, we find that MCGAS leads to better average application makespans than its competitor.
Main file
dutot_et_al.pdf (460.31 KB)
Origin : Files produced by the author(s)

Dates and versions

inria-00347273 , version 1 (15-12-2008)

Identifiers

HAL Id : inria-00347273
DOI : 10.1109/TPDS.2009.11

Cite

Pierre-Francois Dutot, Tchimou N'Takpé, Frédéric Suter, Henri Casanova. Scheduling Parallel Task Graphs on (Almost) Homogeneous Multi-cluster Platforms. IEEE Transactions on Parallel and Distributed Systems, 2008, 20 (7), pp.940-952. ⟨10.1109/TPDS.2009.11⟩. ⟨inria-00347273⟩