Towards Multi-site Metadata Management for Geographically Distributed Cloud Workflows

Abstract: With their globally distributed datacenters, clouds now provide an opportunity to run complex large-scale applications on dynamically provisioned, networked and federated infrastructures. However, there is a lack of tools supporting data-intensive applications across geographically distributed sites. For instance, scientific workflows that handle many small files can easily saturate state-of-the-art distributed filesystems based on centralized metadata servers (e.g., HDFS, PVFS). In this paper, we explore several alternative design strategies to efficiently support the execution of existing workflow engines across multi-site clouds by reducing the cost of metadata operations. These strategies leverage workflow semantics in a 2-level metadata partitioning hierarchy that combines distribution and replication. The system was validated on the Microsoft Azure cloud across 4 EU and US datacenters. The experiments were conducted on 128 nodes using synthetic benchmarks and real-life applications. We observe a gain of up to 28% in execution time for a parallel, geo-distributed real-world application (Montage) and up to 50% for a metadata-intensive synthetic benchmark, compared to a baseline centralized configuration.
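The abstract does not describe how the 2-level hierarchy is built, so the following is only an illustrative sketch of one way distribution and replication could be combined, not the authors' design. It assumes that each metadata entry is hash-partitioned to a "home" site and then to a node inside that site, and that a copy is also kept at the site where the entry was written. The class name `MetadataRegistry`, its methods, and the site names are hypothetical.

```python
import hashlib


class MetadataRegistry:
    """Hypothetical in-memory registry; an assumption, not the paper's system."""

    def __init__(self, sites, nodes_per_site=2):
        self.sites = list(sites)
        # Level 1: one partition per site. Level 2: one shard per node in a site.
        self.shards = {s: [dict() for _ in range(nodes_per_site)] for s in self.sites}

    @staticmethod
    def _hash(key):
        # Stable hash so that every site computes the same placement.
        return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)

    def home_site(self, key):
        # Distribution: the key alone decides which site owns the entry.
        return self.sites[self._hash(key) % len(self.sites)]

    def _shard(self, site, key):
        # Second level: pick a node-local shard inside the chosen site.
        shards = self.shards[site]
        return shards[(self._hash(key) // len(self.sites)) % len(shards)]

    def put(self, key, value, writer_site):
        home = self.home_site(key)
        self._shard(home, key)[key] = value
        if writer_site != home:
            # Replication: keep a copy where the entry was produced.
            self._shard(writer_site, key)[key] = value
        return home

    def get(self, key, reader_site):
        local = self._shard(reader_site, key)
        if key in local:
            return local[key], "local"
        # Fall back to the home site; raises KeyError if the key is unknown.
        return self._shard(self.home_site(key), key)[key], "remote"


registry = MetadataRegistry(["eu-1", "eu-2", "us-1", "us-2"])  # hypothetical sites
registry.put("montage/tile_0001.fits", {"size": 4096}, writer_site="eu-1")
entry, source = registry.get("montage/tile_0001.fits", reader_site="eu-1")
```

In this sketch a read from the writing site is always served locally, while a read from any other non-home site costs one remote lookup, which is the kind of trade-off between distribution and replication that the abstract refers to.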
Document type:
Conference paper
CLUSTER 2015 - IEEE International Conference on Cluster Computing, Sep 2015, Chicago, United States. 2015, 〈10.1109/CLUSTER.2015.49〉

Cited literature [30 references]

https://hal.inria.fr/hal-01239150
Contributor: Luis Pineda-Morales
Submitted on: Monday, December 7, 2015 - 14:38:45
Last modified on: Wednesday, May 16, 2018 - 11:23:28
Document(s) archived on: Tuesday, March 8, 2016 - 13:45:15

File

cluster.pdf
Files produced by the author(s)

Identifiers

HAL Id: hal-01239150
DOI: 10.1109/CLUSTER.2015.49

Citation

Luis Pineda-Morales, Alexandru Costan, Gabriel Antoniu. Towards Multi-site Metadata Management for Geographically Distributed Cloud Workflows. CLUSTER 2015 - IEEE International Conference on Cluster Computing, Sep 2015, Chicago, United States. 2015, 〈10.1109/CLUSTER.2015.49〉. 〈hal-01239150〉

Metrics

Record views: 590

File downloads: 287