Archiving Cold Data in Warehouses with Clustered Network Coding

Abstract: Modern storage systems now typically combine plain replication and erasure codes to reliably store large amounts of data in datacenters. Plain replication allows fast access to popular data, while erasure codes, e.g., Reed-Solomon codes, provide a storage-efficient alternative for archiving less popular data. Although erasure codes are increasingly employed in real systems, they incur high overhead during maintenance, i.e., upon failures, typically requiring files to be decoded before being encoded again to repair the encoded blocks stored at the faulty node. In this paper, we propose a novel erasure code system tailored for networked archival systems. The efficiency of our approach relies on the joint use of random codes and a clustered placement strategy. Our repair protocol leverages network coding techniques to reduce the amount of data transferred during maintenance by 50%, by repairing several cluster files simultaneously. We demonstrate, through both an analysis and an extensive experimental study conducted on a public testbed, that our approach significantly decreases both the bandwidth overhead during the maintenance process and the time to repair lost data. We also show that using a non-systematic code does not impact the throughput, and comes only at the price of higher CPU usage. Based on these results, we evaluate the impact of this higher CPU consumption on different configurations of data coldness by determining whether the cluster's network bandwidth dedicated to repair or the CPU dedicated to decoding saturates first.
Document type:
Conference paper
ACM New York, NY, USA. EuroSys 2014, Apr 2014, Amsterdam, Netherlands. 2014, 〈10.1145/2592798.2592816〉

https://hal.inria.fr/hal-00994660
Contributor: Fabien André
Submitted on: Wednesday, May 21, 2014 - 19:02:37
Last modified on: Wednesday, May 16, 2018 - 11:23:13

Citation

Fabien André, Anne-Marie Kermarrec, Erwan Le Merrer, Nicolas Le Scouarnec, Gilles Straub, et al. Archiving Cold Data in Warehouses with Clustered Network Coding. ACM New York, NY, USA. EuroSys 2014, Apr 2014, Amsterdam, Netherlands. 2014, 〈10.1145/2592798.2592816〉. 〈hal-00994660〉
