14th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS)
Providing High Reliability in a Minimum Redundancy Archival Storage System
Monterey, CA
September 11-14, 2006
ISBN: 0-7695-2573-3
Inter-file compression techniques store files as sets of references to data objects or chunks that can be shared among many files. While these techniques can achieve much better compression ratios than conventional intra-file compression methods such as Lempel-Ziv compression, they also reduce the reliability of the storage system because the loss of a few critical chunks can lead to the loss of many files. We show how to eliminate this problem by choosing for each chunk a replication level that is a function of the amount of data that would be lost if that chunk were lost. Experiments using actual archival data show that our technique can achieve significantly higher robustness than a conventional approach combining data mirroring and intra-file compression while requiring about half the storage space.
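As a rough illustration of the idea in the abstract, the sketch below gives each chunk a weight equal to the total size of the files that reference it, then maps that weight to a number of copies. The abstract does not state the actual function, so the logarithmic policy, the names chunk_weights and replication_level, and the parameters base_weight, min_copies, and max_copies are all assumptions made for this example, not the authors' method.

```python
import math
from collections import defaultdict


def chunk_weights(files):
    """Return, for each chunk, the bytes of file data lost if that chunk is lost.

    `files` maps a file name to (file_size_in_bytes, iterable_of_chunk_ids).
    A file counts as lost when any one of its chunks is lost, so every chunk
    it references is charged the full file size.
    """
    weights = defaultdict(int)
    for size, chunks in files.values():
        for chunk_id in set(chunks):  # count each file once per chunk
            weights[chunk_id] += size
    return dict(weights)


def replication_level(weight, base_weight, min_copies=1, max_copies=4):
    """Hypothetical policy: one extra copy per doubling of weight above base_weight.

    This is an assumed, illustrative function; the abstract only says the
    level depends on the amount of data that would be lost.
    """
    if weight <= base_weight:
        return min_copies
    extra = int(math.log2(weight / base_weight))
    return min(max_copies, min_copies + extra)


# Made-up example data: three files sharing chunk "c1".
files = {
    "a": (100, ["c1", "c2"]),
    "b": (300, ["c1", "c3"]),
    "c": (400, ["c1"]),
}
weights = chunk_weights(files)
levels = {c: replication_level(w, base_weight=100) for c, w in weights.items()}
```

With a policy of this shape, a chunk shared by many large files receives more copies than a chunk used by a single small file, so extra storage is spent only where a loss would be most costly.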
Citation:
Deepavali Bhagwat, Kristal Pollack, Darrell D. E. Long, Thomas Schwarz, Ethan L. Miller, Jehan-Francois Paris, "Providing High Reliability in a Minimum Redundancy Archival Storage System," MASCOTS, pp. 413-421, 14th IEEE International Symposium on Modeling, Analysis, and Simulation, 2006