An Effective Multi-Layer Model for Controlling the Quality of Data
11th International Database Engineering and Applications Symposium (IDEAS 2007)
Banff, Alberta, Canada, September 6-8, 2007
ISBN: 0-7695-2947-X
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/IDEAS.2007.12
Data mining aims to search for implicit, previously unknown, and potentially useful information that might be embedded in the data. The principle of "garbage in, garbage out" is well known: to get meaningful mining results, a clean set of data is essential. In this paper, we propose an effective model for controlling the quality of data. Specifically, this three-layer model focuses on data validity and data consistency. The internal layer ensures that the observed data are valid and that their values fall within reasonable ranges. The temporal layer ensures that data are consistent with their temporal behaviour. The spatial layer ensures that data are consistent with their spatial neighbours. A case study applying our proposed model to real-life weather data for an agricultural application shows that the model is effective in controlling and improving data quality, and thus leads to better mining results. The application of our proposed model is not confined to weather data for agricultural applications. We also discuss how the proposed three-layer model can be applied to control the quality of data in other real-life situations.
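The three layers described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no thresholds or algorithms, so the function names, the temperature bounds, the maximum step between readings, and the use of a neighbour median are all assumptions chosen only to show how an internal, a temporal, and a spatial check might be chained.

```python
from statistics import median

# All thresholds below are hypothetical, chosen only for illustration.
VALID_RANGE = (-60.0, 50.0)   # plausible air temperature bounds, degrees C
MAX_STEP = 10.0               # largest believable change between readings
MAX_NEIGHBOUR_DIFF = 8.0      # largest believable gap from neighbours' median


def internal_check(value, valid_range=VALID_RANGE):
    """Internal layer: the value must fall within a reasonable range."""
    low, high = valid_range
    return low <= value <= high


def temporal_check(value, previous, max_step=MAX_STEP):
    """Temporal layer: the value must not jump too far from the last reading."""
    if previous is None:      # nothing to compare against yet
        return True
    return abs(value - previous) <= max_step


def spatial_check(value, neighbours, max_diff=MAX_NEIGHBOUR_DIFF):
    """Spatial layer: the value must be close to the median of nearby stations."""
    if not neighbours:        # no neighbouring observations available
        return True
    return abs(value - median(neighbours)) <= max_diff


def check_reading(value, previous=None, neighbours=()):
    """Return the name of the first layer that rejects the reading, or None."""
    if not internal_check(value):
        return "internal"
    if not temporal_check(value, previous):
        return "temporal"
    if not spatial_check(value, list(neighbours)):
        return "spatial"
    return None
```

In this sketch the layers run from cheapest to most data-hungry, so a reading that is out of range is rejected before any temporal or spatial context is consulted. A real deployment would need per-variable ranges and thresholds derived from the data in question.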
Citation:
Carson Kai-Sang Leung, Mark Anthony F. Mateo, Andrew J. Nadler, "An Effective Multi-Layer Model for Controlling the Quality of Data," ideas, pp. 28-36, 11th International Database Engineering and Applications Symposium (IDEAS 2007), 2007