Analysis of the Repair Time in Distributed Storage Systems

Abstract : Distributed or peer-to-peer storage systems introduce redundancy to preserve data in case of peer failures or departures. To ensure long-term fault tolerance, the storage system must have a self-repair service that continuously reconstructs lost fragments of redundancy. The speed of this reconstruction process is crucial for data survival, and it is mainly determined by the available bandwidth, a critical resource in such systems. We propose a new analytical framework that takes into account the correlation of concurrent repairs when estimating the repair time and the probability of data loss. In particular, we introduce queuing models in which reconstructions are served by peers at a rate that depends on the available bandwidth. The proposed models and schemes are validated by mathematical analysis, an extensive set of simulations, and experimentation on the Grid'5000 test-bed platform.
Document type :
Reports
[Research Report] RR-7538, INRIA. 2011, pp.28

Cited literature [20 references]

https://hal.inria.fr/inria-00565359
Contributor : Julian Monteiro
Submitted on : Friday, February 11, 2011 - 9:00:01 PM
Last modification on : Saturday, September 17, 2016 - 1:27:33 AM
Document(s) archived on : Sunday, December 4, 2016 - 3:00:18 AM

File

RR-7538.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00565359, version 1

Citation

Frédéric Giroire, Sandeep Kumar Gupta, Remigiusz Modrzejewski, Julian Monteiro, Stéphane Pérennes. Analysis of the Repair Time in Distributed Storage Systems. [Research Report] RR-7538, INRIA. 2011, pp.28. 〈inria-00565359〉

Metrics

Record views : 410
Document downloads : 216