The Community for Technology Leaders
Parallel and Distributed Processing Symposium, International (2011)
Anchorage, Alaska USA
May 16, 2011 to May 20, 2011
ISSN: 1530-2075
ISBN: 978-0-7695-4385-7
pp: 1266-1277
Due to the relatively low bandwidth of WAN (Wide Area Network) that supports cloud backup services, both the backup time and restore time in the cloud backup environment are in desperate need for reduction to make cloud backup a practical and affordable service for small businesses and telecommuters alike. Existing solutions that employ the deduplication technology for cloud backup services only focus on removing redundant data from transmission during backup operations to reduce the backup time, while paying little attention to the restore time that we argue is an important aspect and affects the overall quality of service of the cloud backup services. In this paper, we propose a \emph{CAusality-Based deduplication performance booster for both cloud backup and restore operations}, called CABdedupe, which captures the causal relationship among chronological versions of datasets that are processed in multiple backups/restores, to remove the unmodified data from transmission during not only backup operations but also restore operations, thus to improve both the backup and restore performances. CABdedupe is a middleware that is orthogonal to and can be integrated into any existing backup system. Our extensive experiments, where we integrate CABdedupe into two existing backup systems and feed real world datasets, show that both the backup time and restore time are significantly reduced, with a reduction ratio of up to $103:1$.

Z. Yan, D. Feng, Y. Tan, L. Tian and H. Jiang, "CABdedupe: A Causality-Based Deduplication Performance Booster for Cloud Backup Services," 25th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2011)(IPDPS), Anchorage, AK, 2011, pp. 1266-1277.
95 ms
(Ver 3.3 (11022016))