BlobSeer: Efficient Data Management for Data-Intensive Applications Distributed at Large-Scale

Bogdan Nicolae 1, * Gabriel Antoniu 1 Luc Bougé 1
* Auteur correspondant
1 KerData - Scalable Storage for Clouds and Beyond
Inria Rennes – Bretagne Atlantique , IRISA-D1 - SYSTÈMES LARGE ÉCHELLE
Abstract : Large-scale data-intensive applications are a class of applications that acquire and maintain massive datasets, while performing distributed computations on these datasets. In this context, a a key factor is the storage service responsible for the data management, as it has to efficiently deal with massively parallel data access in order to ensure scalability and performance for the whole system itself. This PhD thesis proposes BlobSeer, a data management service specifically designed to address the needs of large-scale data-intensive applications. Three key design factors: data striping, distributed metadata management and versioning-based concurrency control enable BlobSeer not only to provide efficient support for features commonly used to exploit data-level parallelism, but also enable exploring a set of new features that can be leveraged to further improve parallel data access. Extensive experimentations, both in scale and scope, on the Grid5000 testbed demonstrate clear benefits of using BlobSeer as the underlying storage for a variety of scenarios: data-intensive grid applications, grid file systems, MapReduce datacenters, desktop grids. Further work targets providing efficient storage solutions for cloud computing as well.
Type de document :
Communication dans un congrès
IPDPS '10: Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing: Workshops and Phd Forum, Apr 2010, Atlanta, United States. pp.1-4, 2009, 〈10.1109/IPDPSW.2010.5470802〉
Liste complète des métadonnées

Littérature citée [14 références]  Voir  Masquer  Télécharger

https://hal.inria.fr/inria-00457809
Contributeur : Bogdan Nicolae <>
Soumis le : jeudi 18 février 2010 - 17:14:33
Dernière modification le : vendredi 16 novembre 2018 - 01:39:15
Document(s) archivé(s) le : vendredi 18 juin 2010 - 21:20:36

Fichier

PID1121303.pdf
Fichiers produits par l'(les) auteur(s)

Identifiants

Citation

Bogdan Nicolae, Gabriel Antoniu, Luc Bougé. BlobSeer: Efficient Data Management for Data-Intensive Applications Distributed at Large-Scale. IPDPS '10: Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing: Workshops and Phd Forum, Apr 2010, Atlanta, United States. pp.1-4, 2009, 〈10.1109/IPDPSW.2010.5470802〉. 〈inria-00457809〉

Partager

Métriques

Consultations de la notice

647

Téléchargements de fichiers

235