Distributed Exact Deduplication for Primary Storage Infrastructures

Abstract : Deduplication of primary storage volumes in a cloud computing environment is increasingly desirable, as the resulting space savings contribute to the cost effectiveness of a large-scale multi-tenant infrastructure. However, traditional archival and backup deduplication systems impose prohibitive overhead for latency-sensitive applications deployed on these infrastructures, while current primary deduplication systems rely on special cluster filesystems, centralized components, or restrictive workload assumptions. We present DEDIS, a fully distributed and dependable system that performs exact and cluster-wide background deduplication of primary storage. DEDIS does not depend on data locality and works on top of any unsophisticated storage backend, centralized or distributed, that exports a basic shared block device interface. The evaluation of an open-source prototype shows that DEDIS scales out and adds negligible overhead even when deduplication and intensive storage I/O run simultaneously.

https://hal.inria.fr/hal-01287732
Contributor : Hal Ifip
Submitted on : Monday, March 14, 2016 - 10:49:04 AM
Last modification on : Thursday, May 12, 2016 - 10:49:51 AM
Long-term archiving on : Sunday, November 13, 2016 - 5:48:41 PM

File

326177_1_En_5_Chapter.pdf
Files produced by the author(s)

Licence


Distributed under a Creative Commons Attribution 4.0 International License

Citation

João Paulo, José Pereira. Distributed Exact Deduplication for Primary Storage Infrastructures. 4th International Conference on Distributed Applications and Interoperable Systems (DAIS), Jun 2014, Berlin, Germany. pp.52-66, ⟨10.1007/978-3-662-43352-2_5⟩. ⟨hal-01287732⟩
