BlobSeer: Efficient Data Management for Data-Intensive Applications Distributed at Large-Scale

Bogdan Nicolae 1, *, Gabriel Antoniu 1, Luc Bougé 1
* Corresponding author
1 KerData - Scalable Storage for Clouds and Beyond
IRISA-D1 - SYSTÈMES LARGE ÉCHELLE, Inria Rennes – Bretagne Atlantique
Abstract: Large-scale data-intensive applications acquire and maintain massive datasets while performing distributed computations on them. In this context, a key factor is the storage service responsible for data management, as it must handle massively parallel data access efficiently to ensure scalability and performance for the whole system. This PhD thesis proposes BlobSeer, a data management service specifically designed to address the needs of large-scale data-intensive applications. Three key design factors (data striping, distributed metadata management, and versioning-based concurrency control) enable BlobSeer not only to efficiently support features commonly used to exploit data-level parallelism, but also to explore new features that can be leveraged to further improve parallel data access. Extensive experiments, in both scale and scope, on the Grid5000 testbed demonstrate clear benefits of using BlobSeer as the underlying storage in a variety of scenarios: data-intensive grid applications, grid file systems, MapReduce datacenters, and desktop grids. Further work targets efficient storage solutions for cloud computing as well.
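
As a rough illustration of how two of the design factors named in the abstract, data striping and versioning, can fit together, the following minimal single-process sketch may help. It is not BlobSeer's API or implementation: the class `ToyBlobStore`, its methods, the round-robin placement, and the chunk-aligned write restriction are all assumptions made for this example. Metadata is kept in one plain list here, whereas the abstract describes metadata management as distributed.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBlobStore:
    """Hypothetical toy store: striped chunks plus one immutable layout per version."""

    chunk_size: int = 4
    num_providers: int = 3
    # providers[i] maps a chunk id to its bytes (stands in for one storage node).
    providers: list = field(default_factory=list, init=False)
    # versions[v] is the ordered list of (provider index, chunk id) for snapshot v.
    versions: list = field(default_factory=list, init=False)
    next_chunk_id: int = field(default=0, init=False)

    def __post_init__(self):
        self.providers = [{} for _ in range(self.num_providers)]
        self.versions = [[]]  # version 0 is the empty blob

    def _store_chunk(self, data: bytes):
        """Place a new chunk on a provider chosen round-robin (the striping step)."""
        cid = self.next_chunk_id
        self.next_chunk_id += 1
        provider = cid % self.num_providers
        self.providers[provider][cid] = data
        return (provider, cid)

    def write(self, offset: int, data: bytes) -> int:
        """Write chunk-aligned data and return the number of the new version."""
        if offset % self.chunk_size or len(data) % self.chunk_size:
            raise ValueError("toy store only accepts chunk-aligned writes")
        layout = list(self.versions[-1])  # copy the metadata, never the data
        first = offset // self.chunk_size
        if first > len(layout):
            raise ValueError("toy store does not support holes")
        for i in range(0, len(data), self.chunk_size):
            idx = first + i // self.chunk_size
            ref = self._store_chunk(data[i:i + self.chunk_size])
            if idx < len(layout):
                layout[idx] = ref  # new version points at the new chunk
            else:
                layout.append(ref)  # blob grows by one chunk
        self.versions.append(layout)  # older layouts stay untouched
        return len(self.versions) - 1

    def read(self, version: int) -> bytes:
        """Reassemble a whole snapshot from the chunks its layout references."""
        return b"".join(self.providers[p][cid] for p, cid in self.versions[version])
```

In this sketch, a write never modifies existing chunks; it stores new ones and publishes a new layout, so a reader holding an older version number keeps seeing a consistent snapshot. For example, after `v1 = store.write(0, b"aaaabbbbcccc")` and `v2 = store.write(4, b"XXXX")`, reading `v1` still returns the original twelve bytes while reading `v2` reflects the overwrite.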

https://hal.inria.fr/inria-00457809
Contributor: Bogdan Nicolae
Submitted on: Thursday, February 18, 2010 - 5:14:33 PM
Last modification on: Friday, November 16, 2018 - 1:39:15 AM
Long-term archiving on: Friday, June 18, 2010 - 9:20:36 PM

File

PID1121303.pdf
Files produced by the author(s)

Citation

Bogdan Nicolae, Gabriel Antoniu, Luc Bougé. BlobSeer: Efficient Data Management for Data-Intensive Applications Distributed at Large-Scale. IPDPS '10: Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing: Workshops and Phd Forum, Apr 2010, Atlanta, United States. pp.1-4, ⟨10.1109/IPDPSW.2010.5470802⟩. ⟨inria-00457809⟩
