Conference papers

Agar: A Caching System for Erasure-Coded Data

Abstract: Erasure coding is an established data protection mechanism. It provides high resiliency with low storage overhead, which makes it very attractive to storage systems developers. Unfortunately, when used in a distributed setting, erasure coding hampers a storage system's performance, because it requires clients to contact several, possibly remote sites to retrieve their data. This has hindered the adoption of erasure coding in practice, limiting its use to cold, archival data. Recent research showed that it is feasible to use erasure coding for hot data as well, thus opening new perspectives for improving erasure-coded storage systems. In this paper, we address the problem of minimizing access latency in erasure-coded storage. We propose Agar, a novel caching system tailored for erasure-coded content. Agar optimizes the contents of the cache based on live information regarding data popularity and access latency to different data storage sites. Our system adapts a dynamic programming algorithm to optimize the choice of data blocks that are cached, using an approach akin to "Knapsack" algorithms. We compare Agar to the classical Least Recently Used and Least Frequently Used cache eviction policies, while varying the amount of data cached between a data chunk and a whole replica of the object. We show that Agar can achieve 16% to 41% lower latency than systems that use classical caching policies.
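The abstract only names the general idea (a dynamic program, akin to Knapsack, that selects which blocks to cache), so the following is a minimal illustrative sketch of that idea and not the algorithm from the paper. It assumes a multiple-choice knapsack formulation: each object offers several hypothetical caching options `(chunks_cached, value)`, where `value` stands in for something like popularity multiplied by estimated latency saving. The function name, the option format, and all numbers are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical input format (not from the paper): for each object key,
# a list of options (chunks_cached, value). "value" is an assumed stand-in
# for popularity * estimated latency saving when that many chunks are cached.
Options = Dict[str, List[Tuple[int, float]]]


def choose_cache_contents(options: Options, capacity: int) -> Tuple[float, Dict[str, int]]:
    """Pick at most one option per object so that the total number of cached
    chunks stays within `capacity` and the total value is maximal."""
    # best[c] holds (total value, chosen chunks per object) for the best
    # configuration found so far that uses at most c chunks.
    best: List[Tuple[float, Dict[str, int]]] = [(0.0, {}) for _ in range(capacity + 1)]
    for key, opts in options.items():
        # Start from "cache nothing for this object".
        new_best = list(best)
        for c in range(capacity + 1):
            for chunks, value in opts:
                if chunks <= c:
                    prev_value, prev_choice = best[c - chunks]
                    candidate = prev_value + value
                    if candidate > new_best[c][0]:
                        new_best[c] = (candidate, {**prev_choice, key: chunks})
        best = new_best
    return best[capacity]


# Illustrative numbers only: two objects, cache capacity of three chunks.
example = {
    "a": [(1, 3.0), (2, 5.0)],
    "b": [(1, 4.0), (3, 6.0)],
}
total_value, plan = choose_cache_contents(example, 3)
print(total_value, plan)  # 9.0 {'a': 2, 'b': 1}
```

In this toy instance, caching two chunks of `a` and one chunk of `b` beats caching all three chunks of `b`, which illustrates why partially caching several objects can be preferable to fully caching one.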

https://hal.inria.fr/hal-01617146
Contributor: François Taïani
Submitted on: Monday, October 16, 2017 - 11:35:27 AM
Last modification on: Thursday, January 7, 2021 - 4:35:55 PM
Long-term archiving on: Wednesday, January 17, 2018 - 12:54:14 PM

File

paper.pdf
Files produced by the author(s)

Citation

Raluca Halalai, Pascal Felber, Anne-Marie Kermarrec, François Taïani. Agar: A Caching System for Erasure-Coded Data. ICDCS 2017 - 37th IEEE International Conference on Distributed Computing Systems, Jun 2017, Atlanta, GA, United States. pp.1-11, ⟨10.1109/ICDCS.2017.97⟩. ⟨hal-01617146⟩
