Parallel and Distributed Processing Symposium, International (2010)
Atlanta, GA, USA
Apr. 19, 2010 to Apr. 23, 2010
ISBN: 978-1-4244-6442-5
pp: 1-12
Tianming Yang, School of Computer, Huazhong University of Science and Technology, Wuhan National Laboratory for Optoelectronics, Wuhan, 430074, China
Hong Jiang, Department of Computer Science and Engineering, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
Dan Feng, School of Computer, Huazhong University of Science and Technology, Wuhan National Laboratory for Optoelectronics, Wuhan, 430074, China
Zhongying Niu, School of Computer, Huazhong University of Science and Technology, Wuhan National Laboratory for Optoelectronics, Wuhan, 430074, China
Ke Zhou, School of Computer, Huazhong University of Science and Technology, Wuhan National Laboratory for Optoelectronics, Wuhan, 430074, China
Yaping Wan, School of Computer, Huazhong University of Science and Technology, Wuhan National Laboratory for Optoelectronics, Wuhan, 430074, China
ABSTRACT
Driven by the increasing demand for large-scale, high-performance data protection, disk-based de-duplication storage has become a research focus for both the storage industry and the research community, and several new schemes have emerged recently. So far, these systems mainly take inline de-duplication approaches, which are centralized and do not extend easily to global de-duplication in a distributed environment. We present DEBAR, a de-duplication storage system designed to improve capacity, performance, and scalability for de-duplication backup and archiving. DEBAR performs post-processing de-duplication: in phase I, backup streams are preliminarily de-duplicated through an in-memory filter and cached on server disks; in phase II, they are completely de-duplicated in batch. By decentralizing fingerprint lookup and update, DEBAR lets a cluster of servers perform de-duplication backup in parallel, and is shown to scale linearly in both write throughput and physical capacity, achieving an aggregate throughput of 1.7 GB/s and supporting a physical capacity of 2 PB with 16 backup servers.
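The two-phase flow described in the abstract can be illustrated with a minimal single-server sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the class and method names are hypothetical, SHA-1 stands in for whatever fingerprint function DEBAR uses, a bounded Python set plays the role of the in-memory preliminary filter, and a dict stands in for the on-disk fingerprint index and chunk store. The decentralized, multi-server lookup is not modeled.

```python
import hashlib


def fingerprint(chunk: bytes) -> str:
    # Assumption: SHA-1 is used here only as an example fingerprint function.
    return hashlib.sha1(chunk).hexdigest()


class TwoPhaseDedup:
    """Hypothetical single-server model of post-processing de-duplication."""

    def __init__(self, filter_capacity: int = 1024):
        self.filter_capacity = filter_capacity
        self.filter = set()   # phase I: bounded in-memory preliminary filter
        self.staging = []     # chunks cached on "server disk" until phase II
        self.index = {}       # stand-in for the fingerprint index and chunk store

    def backup(self, chunks):
        """Phase I: drop chunks the filter has already seen, stage the rest."""
        recipe = []
        for chunk in chunks:
            fp = fingerprint(chunk)
            recipe.append(fp)
            if fp in self.filter:
                continue  # duplicate caught by the preliminary filter
            if len(self.filter) < self.filter_capacity:
                self.filter.add(fp)
            # When the filter is full, duplicates can slip through to staging.
            self.staging.append((fp, chunk))
        return recipe

    def batch_dedup(self) -> int:
        """Phase II: check all staged fingerprints against the index in one batch."""
        stored = 0
        for fp, chunk in sorted(self.staging, key=lambda item: item[0]):
            if fp not in self.index:
                self.index[fp] = chunk
                stored += 1
        self.staging.clear()
        self.filter.clear()
        return stored

    def restore(self, recipe) -> bytes:
        """Rebuild a backup stream from its list of fingerprints."""
        return b"".join(self.index[fp] for fp in recipe)
```

With a deliberately small filter, phase I lets some duplicates through and phase II removes them, which is the division of labor the abstract describes: a cheap preliminary pass at backup time, followed by complete de-duplication in batch.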
CITATION

H. Jiang, T. Yang, K. Zhou, Z. Niu, D. Feng and Y. Wan, "DEBAR: A scalable high-performance de-duplication storage system for backup and archiving," 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), Atlanta, GA, 2010, pp. 1-12.
doi:10.1109/IPDPS.2010.5470468