Analysis of the Repair Time in Distributed Storage Systems

Frédéric Giroire; Sandeep Kumar Gupta; Remigiusz Modrzejewski; Julian Monteiro; Stéphane Pérennes

Reports (Research Report) Year : 2011


Frédéric Giroire

Function : Author
PersonId : 5597
IdHAL : frederic-giroire
ORCID : 0000-0002-3727-051X
IdRef : 11611598X

Algorithms, simulation, combinatorics and optimization for telecommunications

Sandeep Kumar Gupta

Function : Author
PersonId : 891944

Indian Institute of Technology Delhi

Remigiusz Modrzejewski

Function : Author
PersonId : 891945

Algorithms, simulation, combinatorics and optimization for telecommunications

Julian Monteiro

Function : Author
PersonId : 884907

Algorithms, simulation, combinatorics and optimization for telecommunications

Stéphane Pérennes

Function : Author
PersonId : 942945

Algorithms, simulation, combinatorics and optimization for telecommunications

Abstract

Distributed or peer-to-peer storage systems introduce redundancy to preserve data in case of peer failures or departures. To ensure long-term fault tolerance, the storage system must have a self-repair service that continuously reconstructs lost fragments of redundancy. The speed of this reconstruction process is crucial for data survival. This speed is mainly determined by the available bandwidth, a critical resource of such systems. We propose a new analytical framework that takes into account the correlation of concurrent repairs when estimating the repair time and the probability of data loss. In particular, we introduce queuing models in which reconstructions are served by peers at a rate that depends on the available bandwidth. The proposed models and schemes are validated by mathematical analysis, an extensive set of simulations, and experimentation on the Grid'5000 test-bed platform.

In distributed or peer-to-peer storage systems, data redundancy must be added to guarantee the integrity of the content when a peer fails or leaves. To give the system long-term resilience to failures, an internal process must continuously reconstruct the lost redundancy fragments. The speed at which these data fragments are reconstructed is crucial to guaranteeing the integrity of the content. The bandwidth available within the system largely determines this reconstruction speed. A new analysis method is proposed that takes into account the correlation between simultaneous repairs when estimating the total repair time and the probability of data loss. Our main contribution is a queuing-based model in which reconstructions are carried out by peers at a rate that depends on the available bandwidth. This model shows that, for most current systems, an exponential reconstruction time is inadequate. The proposed models and schemes have been validated by mathematical analysis as well as by a large number of simulations and experiments on the GRID'5000 platform.

Keywords

P2P storage systems; data lifetime; queuing model; regenerating codes; performance evaluation

Domains

Networking and Internet Architecture [cs.NI]

Main file

RR-7538.pdf (668.57 KB)

Origin : Files produced by the author(s)

Contributor : Julian Monteiro

https://inria.hal.science/inria-00565359

Submitted on : Friday, February 11, 2011, 9:00:01 PM

Last modification on : Monday, February 26, 2024, 11:22:07 AM

Long-term archiving on : Sunday, December 4, 2016, 3:00:18 AM

Dates and versions

inria-00565359 , version 1 (11-02-2011)

Identifiers

HAL Id : inria-00565359 , version 1

Cite

Frédéric Giroire, Sandeep Kumar Gupta, Remigiusz Modrzejewski, Julian Monteiro, Stéphane Pérennes. Analysis of the Repair Time in Distributed Storage Systems. [Research Report] RR-7538, INRIA. 2011, pp.28. ⟨inria-00565359⟩


Collections

CNRS INRIA INRIA-RRRT I3S GRID5000 INRIA2 LARA UNIV-COTEDAZUR SILECS ANR

