Improving Performance of Similarity-Based Clustering by Feature Weight Learning
April 2002 (vol. 24 no. 4)
pp. 556-561

Similarity-based clustering is a simple but powerful technique which usually results in a clustering graph for a partitioning of threshold values in the unit interval. The guiding principle of similarity-based clustering is “similar objects are grouped in the same cluster.” To judge whether two objects are similar, a similarity measure must be given in advance. The similarity measure presented in this paper is defined in terms of the weighted distance between the features of the objects. Thus, the clustering graph and its performance (which is described by several evaluation indices defined in this paper) depend on the feature weights. This paper shows that, by using a gradient descent technique to learn the feature weights, the clustering performance can be significantly improved. It is also shown that the method helps to reduce the uncertainty (fuzziness and nonspecificity) of the similarity matrix, which enhances the quality of similarity-based decision making.
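The abstract does not state the similarity formula or the objective that gradient descent minimizes, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a weighted Euclidean distance, a similarity of the form 1 / (1 + beta * distance), and an objective that pushes each pairwise similarity toward 0 or 1 depending on its unweighted reference value. The names `weighted_distance`, `similarity_matrix`, `objective`, `learn_weights`, and the parameters `beta`, `lr`, and `eps` are all hypothetical, and the gradient is estimated by finite differences for brevity.

```python
import math


def weighted_distance(x, y, w):
    """Weighted Euclidean distance: sqrt(sum_k (w_k * (x_k - y_k))^2)."""
    return math.sqrt(sum((wk * (a - b)) ** 2 for a, b, wk in zip(x, y, w)))


def similarity_matrix(data, w, beta=1.0):
    """Pairwise similarity in (0, 1]; the 1 / (1 + beta * d) form is an assumption."""
    return [[1.0 / (1.0 + beta * weighted_distance(xi, xj, w)) for xj in data]
            for xi in data]


def objective(data, w, ref, beta=1.0):
    """Assumed objective: small when each similarity moves toward 0 or 1,
    in the direction indicated by the reference (unweighted) similarity."""
    sim = similarity_matrix(data, w, beta)
    n = len(data)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            s, r = sim[i][j], ref[i][j]
            total += 0.5 * (s * (1.0 - r) + r * (1.0 - s))
    return total / (n * (n - 1) / 2)


def learn_weights(data, beta=1.0, steps=100, lr=1.0, eps=1e-6):
    """Gradient descent on the feature weights, starting from all ones.
    A step is accepted only if it lowers the objective; otherwise the
    learning rate is halved."""
    w = [1.0] * len(data[0])
    ref = similarity_matrix(data, w, beta)  # reference: all weights equal to 1
    best = objective(data, w, ref, beta)
    for _ in range(steps):
        grad = []
        for k in range(len(w)):
            bumped = list(w)
            bumped[k] += eps
            # Forward finite-difference estimate of dE/dw_k.
            grad.append((objective(data, bumped, ref, beta) - best) / eps)
        trial = [max(0.0, wk - lr * g) for wk, g in zip(w, grad)]
        value = objective(data, trial, ref, beta)
        if value < best:
            w, best = trial, value
        else:
            lr *= 0.5
    return w


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    points = [[0.0, 0.0], [0.1, 5.0], [3.0, 0.2], [3.1, 4.9]]
    print(learn_weights(points))
```

A learned weight near zero suggests the corresponding feature contributes little to the similarity structure. In practice `beta` would have to be chosen for the data at hand; the value 1.0 here is arbitrary.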

Index Terms:
clustering, similarity-based clustering, transitive closure, fuzziness and nonspecificity, gradient-descent technique
D.S. Yeung, X.Z. Wang, "Improving Performance of Similarity-Based Clustering by Feature Weight Learning," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 4, pp. 556-561, April 2002, doi:10.1109/34.993562