Attribute-Level Neighbor Hierarchy Construction Using Evolved Pattern-Based Knowledge Induction
July 2006 (vol. 18 no. 7)
pp. 917-929
Neighbor knowledge construction is the foundation for developing cooperative query answering systems that can search for close-match or approximate answers when exact-match answers are not available. This paper presents a technique for developing neighbor hierarchies at the attribute level. The proposed technique, called evolved Pattern-based Knowledge Induction (ePKI), constructs neighbor hierarchies for nonunique attributes based upon the confidences, popularities, and clustering correlations of inferential relationships among attribute values. The technique applies to both categorical and numerical (discrete and continuous) attribute values. Attribute value neighbor hierarchies generated by the ePKI technique allow a cooperative query answering system to search for approximate answers by relaxing each individual query condition separately. Consequently, users can search for approximate answers even when exact-match answers do not exist in the database (e.g., searching for existing similar parts when implementing rapid prototyping concepts). Several experiments were conducted to assess the performance of the ePKI technique in constructing attribute-level neighbor hierarchies. Results indicate that the ePKI technique produces accurate neighbor hierarchies when strong inferential relationships exist among the data.
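As an illustration of relaxing a single query condition with an attribute-level neighbor hierarchy, the following is a minimal sketch only. The hierarchy `PARENT`, the `material` attribute, and the functions `leaves_under` and `relaxed_search` are hypothetical names chosen for this example. The hierarchy is written by hand here rather than induced from data, so the sketch does not reproduce the ePKI construction step or its confidence and popularity measures.

```python
from typing import Dict, List, Optional

# Hypothetical neighbor hierarchy for one attribute ("material").
# Each value or cluster maps to its parent cluster; the root maps to None.
# In practice such a hierarchy would be induced from data, not hand-written.
PARENT: Dict[str, Optional[str]] = {
    "steel": "ferrous",
    "cast iron": "ferrous",
    "aluminum": "non-ferrous",
    "brass": "non-ferrous",
    "ferrous": "metal",
    "non-ferrous": "metal",
    "metal": None,
}


def leaves_under(node: str) -> List[str]:
    """Return every leaf attribute value at or below the given node."""
    children = [child for child, parent in PARENT.items() if parent == node]
    if not children:
        return [node]
    result: List[str] = []
    for child in children:
        result.extend(leaves_under(child))
    return result


def relaxed_search(rows: List[dict], attribute: str, value: str) -> List[dict]:
    """Widen one query condition level by level until some row matches."""
    node: Optional[str] = value
    while node is not None:
        accepted = set(leaves_under(node))
        matches = [row for row in rows if row[attribute] in accepted]
        if matches:
            return matches
        node = PARENT.get(node)  # climb to the next, broader cluster
    return []


parts = [
    {"id": 1, "material": "cast iron"},
    {"id": 2, "material": "brass"},
]
# No part is made of steel, so the condition is relaxed to the
# enclosing "ferrous" cluster, which contains cast iron.
print(relaxed_search(parts, "material", "steel"))
```

Because only the one condition is widened, any other conditions in a query could be kept exact or relaxed independently with their own hierarchies.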

Index Terms:
Approximate query answering, clustering, knowledge discovery, query-answering systems, similarity measures.
Thanit Puthpongsiriporn, J. David Porter, Bopaya Bidanda, Ming-En Wang, Richard E. Billo, "Attribute-Level Neighbor Hierarchy Construction Using Evolved Pattern-Based Knowledge Induction," IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 7, pp. 917-929, July 2006, doi:10.1109/TKDE.2006.104