Issue No. 02 - February (2006 vol. 18)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2006.23
Data mining uncovers hidden, previously unknown, and potentially useful information from large amounts of data. Compared with traditional statistical and machine learning techniques for data analysis, data mining emphasizes providing a convenient and complete environment for the analysis. In this paper, we propose an integrated framework for visualized, exploratory data clustering and pattern extraction from mixed data. We further discuss its implementation techniques: a generalized self-organizing map (GSOM) and an extended attribute-oriented induction (EAOI), which not only overcome the drawbacks of the original algorithms but also provide additional analysis capabilities. Specifically, the GSOM facilitates the direct handling of mixed data, including categorical and numeric values. The EAOI enables exploration of major values hidden in the data and, in addition, offers an alternative way of processing numeric attributes instead of generalizing them. A prototype was developed for experiments with synthetic and real data sets, and its results were compared with those of the traditional approaches. The results confirmed the feasibility of the framework and the superiority of the extended techniques.
Index Terms: Attribute-oriented induction, clustering, data mining, pattern discovery, self-organizing map.
C. Hsu and S. Wang, "An Integrated Framework for Visualized and Exploratory Pattern Discovery in Mixed Data," in IEEE Transactions on Knowledge & Data Engineering, vol. 18, no. 2, pp. 161-173, 2006.