Generality-Based Conceptual Clustering with Probabilistic Concepts
February 2001 (vol. 23 no. 2)
pp. 196-206

Abstract: Statistical research in clustering has almost universally focused on data sets described by continuous features, and its methods are difficult to apply to tasks involving symbolic features. In addition, these methods are seldom concerned with helping the user interpret the results obtained. Machine learning researchers have developed conceptual clustering methods aimed at solving these problems. Following a long-standing tradition in AI, early conceptual clustering implementations employed logic as the mechanism of concept representation. However, logical representations have been criticized for constraining the resulting cluster structures to be described by necessary and sufficient conditions. An alternative is to use probabilistic concepts, which associate a probability or weight with each property of the concept definition. In this paper, we propose a symbolic hierarchical clustering model that makes use of probabilistic representations and extends the traditional ideas of specificity and generality typically found in machine learning. We propose a parameterized measure that allows users to specify both the number of levels and the degree of generality of each level. The system improves user interaction in the clustering process in two ways: it gives the user feedback about how balanced the generality of the concepts created at each level is, and the user parameter behaves intuitively.
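
As an illustration of the probabilistic concepts mentioned in the abstract, the following is a minimal sketch that summarizes a cluster of symbolic instances by the conditional probability of each attribute value. The class name ProbabilisticConcept, its methods, and the toy instances are assumptions made for this example; they are not taken from the paper, and the paper's parameterized generality measure is not reproduced here.

```python
from collections import Counter, defaultdict


class ProbabilisticConcept:
    """Hypothetical helper: stores P(attribute = value | concept) as counts."""

    def __init__(self):
        self.count = 0                             # instances covered by the concept
        self.value_counts = defaultdict(Counter)   # attribute -> value -> count

    def add(self, instance):
        """Incorporate one instance given as a dict of symbolic attribute values."""
        self.count += 1
        for attribute, value in instance.items():
            self.value_counts[attribute][value] += 1

    def probability(self, attribute, value):
        """Relative frequency of `value` for `attribute` among covered instances."""
        if self.count == 0:
            return 0.0
        return self.value_counts[attribute][value] / self.count


# Toy data (invented for illustration only).
instances = [
    {"color": "red", "shape": "round"},
    {"color": "red", "shape": "square"},
    {"color": "green", "shape": "round"},
    {"color": "red", "shape": "square"},
]

concept = ProbabilisticConcept()
for instance in instances:
    concept.add(instance)

print(concept.probability("color", "red"))    # 3 of 4 instances are red
print(concept.probability("shape", "round"))  # 2 of 4 instances are round
```

Unlike a logical description, no attribute value here has to be necessary or sufficient for membership: a value that appears in only some instances simply receives a probability below 1.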

Index Terms:
Conceptual clustering, hierarchical clustering, probabilistic concepts, user interaction.
Citation:
Luis Talavera, Javier Béjar, "Generality-Based Conceptual Clustering with Probabilistic Concepts," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 2, pp. 196-206, Feb. 2001, doi:10.1109/34.908969