Multisite Management of Data-intensive Scientific Workflows in the Cloud

Ji Liu 1, 2, *
* Corresponding author
2 ZENITH - Scientific Data Management
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract: Current solutions for the parallel execution of scientific workflows are appropriate for static computing and storage resources in a grid environment. They have been extended to handle more elastic resources in a cloud, but only at a single site. Our analysis [1] of current techniques for scientific workflow parallelization and execution shows considerable room for improvement in the following directions:
1. Data staging: existing techniques mainly start scientific workflow execution only after gathering all the related data in a shared-disk file system at one data center, which is time-consuming.
2. Architecture: scientific workflow management systems (SWfMSs) are generally centralized, with a master node that manages all optimization and scheduling processes and is both a single point of failure and a performance bottleneck.
3. Task scheduling and data location: most SWfMSs do not take data location into account during task scheduling, which makes reading and writing data inefficient.
4. Multisite: novel task and data scheduling approaches are required to use resources in a multisite cloud.
In the rest of this paper, we define the problem more precisely and introduce our approach to address it.
Document type:
Conference paper
BDA: Bases de Données Avancées, Oct 2014, Autrans, France. pp.28-30, 2014, Gestion de données - principes, technologies et applications. 〈http://bda2014.imag.fr〉

Cited literature [6 references]

https://hal.inria.fr/hal-01169960
Contributor: David Gross-Amblard
Submitted on: Tuesday, 30 June 2015 - 15:34:32
Last modified on: Thursday, 24 May 2018 - 15:59:21
Document(s) archived on: Tuesday, 25 April 2017 - 20:25:12

File

bda2014-actes-phd-5-pp28-30.pd...
Publisher files allowed on an open archive

License


Distributed under a Creative Commons Attribution - NonCommercial - NoDerivatives 4.0 International License

Identifiers

  • HAL Id : hal-01169960, version 1

Citation

Ji Liu. Multisite Management of Data-intensive Scientific Workflows in the Cloud. BDA: Bases de Données Avancées, Oct 2014, Autrans, France. pp.28-30, 2014, Gestion de données - principes, technologies et applications. 〈http://bda2014.imag.fr〉. 〈hal-01169960〉

Metrics

Record views

597

File downloads

191