Issue No. 04 - Aug. (2017 vol. 25)
ISSN: 1063-6692
pp: 2040-2053
Yiming Zhang, National Laboratory for Parallel and Distributed Processing, School of Computer, National University of Defense Technology, Changsha, China
Dongsheng Li, National Laboratory for Parallel and Distributed Processing, School of Computer, National University of Defense Technology, Changsha, China
Chuanxiong Guo, Microsoft Azure, Sammamish, WA, USA
Haitao Wu, Microsoft Azure, Sammamish, WA, USA
Yongqiang Xiong, Microsoft Research Asia, Beijing, China
Xicheng Lu, National Laboratory for Parallel and Distributed Processing, School of Computer, National University of Defense Technology, Changsha, China
ABSTRACT
In-memory storage offers low I/O latency and high I/O throughput. Fast failure recovery is crucial for large-scale in-memory storage systems, but it brings network-related challenges, including false detection due to transient network problems, traffic congestion during recovery, and top-of-rack switch failures. To achieve fast failure recovery, this paper presents CubicRing, a distributed structure for cube-based networks that exploits network proximity to restrict failure detection and recovery to the smallest possible one-hop range. We leverage the CubicRing structure to address these challenges and design a network-aware in-memory key-value store called MemCube. On a 64-node 10GbE testbed, MemCube recovers 48 GB of data for a single server failure in 3.1 s. The 14 recovery servers achieve 123.9 Gb/s aggregate recovery throughput, which is 88.5% of the ideal aggregate bandwidth and several times faster than RAMCloud under the same configuration.
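The recovery figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is not from the paper: it assumes that 1 GB means 10^9 bytes and that the "ideal aggregate bandwidth" is simply the 14 recovery servers each using one full 10 Gb/s link. The abstract does not define the ideal explicitly, so treat that as an illustrative reading; all function and constant names are hypothetical.

```python
# Figures quoted in the abstract.
DATA_RECOVERED_GB = 48.0   # data recovered after a single server failure
RECOVERY_TIME_S = 3.1      # reported recovery time
RECOVERY_SERVERS = 14      # number of recovery servers

# Assumption (not stated in the abstract): each recovery server
# contributes one full 10GbE link, and 1 GB = 10^9 bytes.
LINK_RATE_GBPS = 10.0


def aggregate_throughput_gbps(data_gb: float, seconds: float) -> float:
    """Convert gigabytes moved in `seconds` to gigabits per second."""
    return data_gb * 8 / seconds


def ideal_bandwidth_gbps(servers: int, link_rate_gbps: float) -> float:
    """Upper bound if every recovery server saturates its link."""
    return servers * link_rate_gbps


achieved = aggregate_throughput_gbps(DATA_RECOVERED_GB, RECOVERY_TIME_S)
ideal = ideal_bandwidth_gbps(RECOVERY_SERVERS, LINK_RATE_GBPS)
efficiency = achieved / ideal

print(f"achieved: {achieved:.1f} Gb/s")
print(f"ideal:    {ideal:.1f} Gb/s")
print(f"ratio:    {efficiency:.1%}")
```

Under these assumptions the three reported numbers are mutually consistent: 48 GB in 3.1 s works out to roughly 123.9 Gb/s, which is about 88.5% of a 140 Gb/s ceiling.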
INDEX TERMS
Servers, Switches, Random access memory, Bandwidth, Throughput, Aggregates, Data models
CITATION

Y. Zhang, D. Li, C. Guo, H. Wu, Y. Xiong and X. Lu, "CubicRing: Exploiting Network Proximity for Distributed In-Memory Key-Value Store," in IEEE/ACM Transactions on Networking, vol. 25, no. 4, pp. 2040-2053, 2017.
doi:10.1109/TNET.2017.2669215