Issue No. 08 - Aug. (2015 vol. 64)
Yunfeng Zhu, AnHui Province Key Laboratory of High Performance Computing, School of Computer Science and Technology, University of Science & Technology of China, Hefei, Anhui, China
Jian Lin, Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
Patrick P. C. Lee, Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
Yinlong Xu, AnHui Province Key Laboratory of High Performance Computing, School of Computer Science and Technology, University of Science & Technology of China, Hefei, Anhui, China
Distributed storage systems provide large-scale data storage services, yet they are confronted with frequent node failures. To ensure data availability, a storage system often introduces data redundancy via replication or erasure coding. As erasure coding incurs significantly less redundancy overhead than replication under the same fault tolerance, it has been increasingly adopted in large-scale storage systems. In erasure-coded storage systems, degraded reads to temporarily unavailable data are very common, and hence boosting the performance of degraded reads becomes important. One challenge is that storage nodes tend to be heterogeneous, with different storage capacities and I/O bandwidths. To this end, we propose FastDR, a system that addresses node heterogeneity and exploits I/O parallelism to boost the performance of degraded reads to temporarily unavailable data. FastDR incorporates a greedy algorithm that seeks to reduce the data transfer cost of reading surviving data for degraded reads, while allowing the search for an efficient degraded read solution to be completed in a timely manner. We implement a FastDR prototype and conduct extensive evaluations through simulation studies as well as testbed experiments on a Hadoop cluster with 10 storage nodes. We demonstrate that FastDR achieves more efficient degraded reads than existing approaches.
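The two quantitative ideas in the abstract can be illustrated with a minimal sketch. It first compares the storage overhead of replication with that of an (n, k) erasure code that tolerates the same number of failures. It then shows why node heterogeneity matters when k surviving chunks are read in parallel. All function names, node names, and bandwidth figures are hypothetical, and the "pick the k fastest survivors" rule is only a naive baseline under the assumption that any k surviving chunks suffice for decoding. It is not FastDR's greedy algorithm, which the abstract does not specify.

```python
from fractions import Fraction


def replication_overhead(copies: int) -> Fraction:
    """Storage used per unit of user data with `copies` full replicas."""
    return Fraction(copies, 1)


def erasure_overhead(n: int, k: int) -> Fraction:
    """Storage used per unit of user data with an (n, k) code:
    k data chunks are encoded into n chunks, tolerating n - k failures
    (assuming an MDS code, so any k chunks suffice to decode)."""
    return Fraction(n, k)


def pick_fastest_survivors(bandwidth_mb_per_s: dict, failed: set, k: int) -> list:
    """Naive baseline (NOT FastDR): choose the k surviving nodes with the
    highest I/O bandwidth for a degraded read."""
    survivors = {node: bw for node, bw in bandwidth_mb_per_s.items()
                 if node not in failed}
    if len(survivors) < k:
        raise ValueError("fewer than k surviving nodes; data cannot be decoded")
    return sorted(survivors, key=survivors.get, reverse=True)[:k]


def parallel_read_time(chunk_mb: float, bandwidth_mb_per_s: dict, nodes: list) -> float:
    """Time to read one chunk from each node in parallel: the slowest node
    determines when the degraded read can complete."""
    return max(chunk_mb / bandwidth_mb_per_s[node] for node in nodes)


if __name__ == "__main__":
    # 3-way replication and a (9, 6) code both tolerate some node failures,
    # but the code stores 1.5x the data instead of 3x.
    print(replication_overhead(3), erasure_overhead(9, 6))

    # Hypothetical heterogeneous cluster; node n3 is temporarily unavailable.
    bandwidth = {"n1": 100.0, "n2": 50.0, "n3": 200.0, "n4": 25.0, "n5": 100.0}
    chosen = pick_fastest_survivors(bandwidth, failed={"n3"}, k=3)
    print(chosen, parallel_read_time(100.0, bandwidth, chosen))
```

In this toy setting, the parallel read time is set by the slowest chosen node, so including a low-bandwidth node such as the hypothetical `n4` would lengthen the degraded read even though the same amount of data is transferred.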
Encoding, Decoding, Parallel processing, Optimization, Equations, Redundancy, Bandwidth
Y. Zhu, J. Lin, P. P. Lee and Y. Xu, "Boosting Degraded Reads in Heterogeneous Erasure-Coded Storage Systems," in IEEE Transactions on Computers, vol. 64, no. 8, pp. 2145-2157, 2015.