2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS) (2017)
Atlanta, Georgia, USA
June 5, 2017 to June 8, 2017
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDCS.2017.97
Erasure coding is an established data protection mechanism. It provides high resiliency with low storage overhead, which makes it very attractive to storage system developers. Unfortunately, when used in a distributed setting, erasure coding hampers a storage system's performance, because it requires clients to contact several, possibly remote, sites to retrieve their data. This has hindered the adoption of erasure coding in practice, limiting its use to cold, archival data. Recent research showed that it is feasible to use erasure coding for hot data as well, thus opening new perspectives for improving erasure-coded storage systems. In this paper, we address the problem of minimizing access latency in erasure-coded storage. We propose Agar, a novel caching system tailored for erasure-coded content. Agar optimizes the contents of the cache based on live information regarding data popularity and access latency to different data storage sites. Our system adapts a dynamic programming algorithm to optimize the choice of data blocks that are cached, using an approach akin to "Knapsack" algorithms. We compare Agar to the classical Least Recently Used and Least Frequently Used cache eviction policies, while varying the amount of data cached between a data chunk and a whole replica of the object. We show that Agar can achieve 16% to 41% lower latency than systems that use classical caching policies.
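The knapsack-style selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a generic multiple-choice knapsack written under stated assumptions. The names `read_latency` and `plan_cache` are hypothetical, chunks are assumed to be fetched in parallel (so a read is as slow as its slowest fetch), the cache is assumed to hold the chunks that would otherwise come from the slowest sites, and the value of an option is assumed to be popularity times the latency saved.

```python
from typing import Dict, List, Tuple


def read_latency(site_latencies: List[float], k: int, cached: int,
                 cache_latency: float = 0.0) -> float:
    """Latency of one read when `cached` of the k required chunks are cached.

    Assumption: fetches run in parallel, so the read takes as long as its
    slowest fetch, and the cache replaces the slowest of the k nearest sites.
    """
    nearest = sorted(site_latencies)[:k]      # the k closest chunk sites
    if cached == 0:
        return max(nearest)
    remaining = nearest[:k - cached]          # sites still contacted remotely
    return max(remaining + [cache_latency])


def plan_cache(objects: Dict[str, Tuple[float, List[float], int]],
               capacity: int) -> Tuple[float, Dict[str, int]]:
    """Choose how many chunks of each object to cache (multiple-choice knapsack).

    objects maps a name to (popularity, per-site latencies, k).
    capacity is the cache size in chunks.
    Returns (total popularity-weighted latency saved, {name: chunks cached}).
    """
    # best[c] = best (value, plan) found so far using at most c cache slots
    best: List[Tuple[float, Dict[str, int]]] = [(0.0, {}) for _ in range(capacity + 1)]
    for name, (popularity, latencies, k) in objects.items():
        base = read_latency(latencies, k, 0)
        new_best = list(best)                 # option: cache nothing of this object
        for cap in range(capacity + 1):
            for chunks in range(1, min(k, cap) + 1):
                gain = popularity * (base - read_latency(latencies, k, chunks))
                prev_value, prev_plan = best[cap - chunks]
                if prev_value + gain > new_best[cap][0]:
                    new_best[cap] = (prev_value + gain, {**prev_plan, name: chunks})
        best = new_best
    return best[capacity]


# Two hypothetical objects, each needing k = 2 chunks, and a cache of 2 chunks.
objects = {
    "hot": (10, [10, 100], 2),
    "cold": (5, [20, 80], 2),
}
value, plan = plan_cache(objects, capacity=2)
print(value, plan)  # 1200.0 {'hot': 1, 'cold': 1}
```

In this made-up example, caching one chunk of each object saves more weighted latency (1200) than caching both chunks of the more popular object (1000), which is the kind of trade-off between a single chunk and a whole replica that the abstract refers to.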
Encoding, Distributed databases, Electronic mail, Heuristic algorithms, Redundancy, Bandwidth, Algorithm design and analysis
R. Halalai, P. Felber, A. Kermarrec and F. Taiani, "Agar: A Caching System for Erasure-Coded Data," 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS), Atlanta, Georgia, USA, 2017, pp. 23-33.