Jaehong Min, Daeyoung Yoon, Youjip Won, "Efficient Deduplication Techniques for Modern Backup Operation," IEEE Transactions on Computers, vol. 60, no. 6, pp. 824-840, June 2011.
DOI: http://doi.ieeecomputersociety.org/10.1109/TC.2010.263. ISSN: 0018-9340. Publisher: IEEE Computer Society, Los Alamitos, CA, USA.
Keywords: deduplication, chunking, backup, index partitioning, fingerprint lookup.

