2014 Second International Symposium on Computing and Networking (CANDAR) (2014)
Shizuoka, Japan
Dec. 10, 2014 to Dec. 12, 2014
ISBN: 978-1-4799-4152-0
pp: 226-230
ABSTRACT
Data deduplication systems commonly use a Bloom filter to identify redundant data quickly. A Bloom filter can determine whether a chunk may already be stored, but it can also report false positives. To rule out a false positive, a new chunk must be compared with the chunks that have already been stored. To reduce the time needed for this check, existing approaches store chunk IDs in many small index tables. However, the target chunk ID is stored in only one index table, so the time spent searching the other index tables is wasted. In this paper, we cluster the stored chunks to reduce the time needed to exclude the false positives induced by the Bloom filter.
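The lookup path described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how chunks are clustered, so the rule used here (the first byte of a SHA-256 chunk ID selects the cluster) is an assumption made only for illustration, as are the names BloomFilter, ClusteredChunkStore, and store, the filter size, and the number of clusters. The sketch also confirms a Bloom filter hit by looking up the chunk ID rather than comparing chunk contents.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter over byte strings (illustrative only)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True means "possibly stored" (may be a false positive);
        # False means "definitely not stored".
        return all(self.bits[pos] for pos in self._positions(item))


class ClusteredChunkStore:
    """Hypothetical store: one index table of chunk IDs per cluster."""

    def __init__(self, num_clusters=16):
        self.num_clusters = num_clusters
        self.bloom = BloomFilter()
        self.tables = [set() for _ in range(num_clusters)]

    @staticmethod
    def chunk_id(chunk):
        return hashlib.sha256(chunk).digest()

    def _cluster_of(self, cid):
        # ASSUMPTION: the cluster is chosen from the first byte of the
        # chunk ID. The paper's actual clustering rule is not given in
        # the abstract.
        return cid[0] % self.num_clusters

    def store(self, chunk):
        """Return True if the chunk was new and stored, False if duplicate."""
        cid = self.chunk_id(chunk)
        table = self.tables[self._cluster_of(cid)]
        # A Bloom filter hit is confirmed against a single index table
        # instead of every table; a miss skips the table lookup entirely.
        if self.bloom.might_contain(cid) and cid in table:
            return False
        self.bloom.add(cid)
        table.add(cid)
        return True
```

Storing the same chunk twice returns True and then False. A false positive from the filter costs one lookup in the chunk's own cluster table, since no other table can hold that chunk ID.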
INDEX TERMS
Indexes, Arrays, Multimedia communication, Linux, Kernel, Cloud computing, System performance
CITATION

C. Tseng, J. Ciou and T. Liu, "A Cluster-Based Data De-duplication Technology," 2014 Second International Symposium on Computing and Networking (CANDAR), Shizuoka, Japan, 2014, pp. 226-230.
doi:10.1109/CANDAR.2014.22