June 13, 2005 to June 16, 2005
Ying Chen, IBM Almaden Research Center
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICAC.2005.35
A crucial first step toward fully automated Information Lifecycle Management (ILM) is to differentiate information by value in an unbiased manner and to understand how that value changes over time. This paper presents an information valuation approach that quantifies the value of a given piece of information based on its usage over time. Our case study, based on several real-world NFS file server traces collected from Harvard University, shows that such a model is simple, effective, and tangible, since it relies on measurable metrics and observable facts. It captures how the value of a file changes throughout its lifecycle and reflects the value differences among files, and hence allows one to compare and classify files. More importantly, additional analysis of the model outputs yields new insights into files, e.g., which files are most valuable and when. We show that files in different value classes exhibit different characteristics and can be characterized by unique sets of attributes. By devising algorithms that extract such attributes automatically for each class of files, storage systems can predict which class a file will belong to early in its lifecycle, e.g., at creation time. File valuation, classification, and class membership prediction can then guide a wide range of new optimizations, e.g., data placement across tiered storage and data protection.
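To make the idea of usage-based valuation concrete, the following is a minimal sketch and not the model from the paper, whose formula the abstract does not give. It assumes a hypothetical trace of (timestamp in days, file path) access records, assumes that a file's value in a time window is its access count in that window plus a decayed carry-over from the previous window, and assumes a single threshold for splitting files into two value classes. The names `usage_value_over_time`, `classify`, `window_days`, `decay`, and `high` are all illustrative.

```python
from collections import defaultdict


def usage_value_over_time(accesses, window_days=1.0, decay=0.5):
    """Return {path: [value in window 0, value in window 1, ...]}.

    Assumed model (not the paper's): the value in a window is the number of
    accesses in that window plus `decay` times the previous window's value.
    `accesses` is a hypothetical trace: a list of (timestamp_days, path).
    """
    if not accesses:
        return {}
    start = min(t for t, _ in accesses)
    end = max(t for t, _ in accesses)
    n_windows = int((end - start) // window_days) + 1

    # Count accesses per file per window.
    counts = defaultdict(lambda: [0] * n_windows)
    for t, path in accesses:
        counts[path][int((t - start) // window_days)] += 1

    # Turn counts into a value series with exponential carry-over.
    values = {}
    for path, per_window in counts.items():
        series, prev = [], 0.0
        for c in per_window:
            prev = c + decay * prev
            series.append(prev)
        values[path] = series
    return values


def classify(values, high=2.0):
    """Label a file 'high' if its peak value reaches `high`, else 'low'.

    The two-class split and the threshold are illustrative assumptions.
    """
    return {p: ("high" if max(s) >= high else "low") for p, s in values.items()}


if __name__ == "__main__":
    trace = [(0.0, "a"), (0.5, "a"), (1.2, "a"), (2.9, "b")]
    vals = usage_value_over_time(trace)
    print(vals)
    print(classify(vals))
```

A per-window value series like this lets files be compared at a given time and tracked across their lifecycles; a real system would replace the assumed decay rule and threshold with whatever valuation and classification the deployment calls for.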
Ying Chen, "Information Valuation for Information Lifecycle Management", Proceedings. Second International Conference on Autonomic Computing (ICAC), 2005, pp. 135-146, doi:10.1109/ICAC.2005.35