Issue No. 05 - May (2015 vol. 64)
ISSN: 0018-9340
pp: 1375-1388
Youjip Won, Division of Computer Science and Engineering, Hanyang University, Seoul, Korea
Kyeongyeol Lim, Division of Computer Science and Engineering, Hanyang University, Seoul, Korea
Jaehong Min, Division of Computer Science and Engineering, Hanyang University, Seoul, Korea
ABSTRACT
In this work, we developed MUCH, a novel multithreaded variable-size chunking method that exploits the multicore architecture of modern microprocessors. The legacy single-threaded variable-size chunking method cannot fully exploit the bandwidth of state-of-the-art storage devices. MUCH guarantees chunking invariability: the result of chunking does not change regardless of the degree of multithreading or the segment size. This is achieved by inter- and intra-segment coalescing at the master thread and Dual Mode Chunking at the client thread. We developed a performance model to determine the optimal multithreading degree and segment size. MUCH is implemented in a prototype deduplication system. By fully exploiting the available CPU cores (quad-core), we achieved up to a $4\times$ increase in chunking performance (MByte/sec). By parallelizing the file chunking operation while guaranteeing chunking invariability, MUCH addresses the performance of file chunking, which is one of the bottlenecks in modern deduplication systems.
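For readers unfamiliar with the single-threaded baseline that the abstract refers to, the following is a minimal sketch of variable-size, content-based chunking. It is not the authors' implementation: the polynomial rolling hash, the window size, the boundary mask, and the minimum and maximum chunk sizes are all illustrative assumptions, and the function name `chunk_boundaries` is hypothetical.

```python
# Sketch of sequential content-based chunking (all constants are assumptions).
WINDOW = 16           # bytes covered by the sliding window
BASE = 257            # polynomial base of the rolling hash
MOD = (1 << 31) - 1   # modulus of the rolling hash
MASK = (1 << 8) - 1   # cut when the low 8 bits of the hash are zero
MIN_CHUNK = 64        # smallest chunk allowed (except the final one)
MAX_CHUNK = 1024      # largest chunk allowed


def chunk_boundaries(data):
    """Return the end offset of every chunk in `data`, in increasing order."""
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    cuts = []
    start = 0  # offset where the current chunk begins
    h = 0      # rolling hash over the last WINDOW bytes of the current chunk
    for i, b in enumerate(data):
        pos = i - start
        if pos >= WINDOW:
            # Drop the oldest byte before shifting the new one in.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + b) % MOD
        length = pos + 1
        if (length >= MIN_CHUNK and (h & MASK) == 0) or length >= MAX_CHUNK:
            cuts.append(i + 1)
            start = i + 1
            h = 0
    if start < len(data):
        cuts.append(len(data))  # trailing partial chunk
    return cuts
```

In this sketch each boundary depends on where the previous chunk ended, because of the minimum and maximum size limits. Chunking fixed-size segments independently in separate threads can therefore produce boundaries that differ from the sequential result near segment edges, which is the situation that chunking invariability, as described in the abstract, is meant to rule out.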
INDEX TERMS
Instruction sets, Bandwidth, Multithreading, Hardware, Upper bound, Central Processing Unit, Redundancy, multithread, Content-based chunking, deduplication
CITATION
Youjip Won, Kyeongyeol Lim, Jaehong Min, "MUCH: Multithreaded Content-Based File Chunking", IEEE Transactions on Computers, vol. 64, no. 5, pp. 1375-1388, May 2015, doi:10.1109/TC.2014.2322600