2007 IEEE 23rd International Conference on Data Engineering (2007)
Apr. 15, 2007 to Apr. 20, 2007
Zong-Xian Yin, Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan, E-mail: firstname.lastname@example.org
Jung-Hsien Chiang, Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan, E-mail: email@example.com
This paper proposes a new clustering algorithm, the Possibilistic Latent Variables (PLV) clustering algorithm. It provides a powerful tool for analyzing complex data, such as clinical diagnosis and biological expression data, because it is robust to various data distributions and accurate in establishing appropriate groups from the data. The algorithm combines a distribution model with the concept of fuzzy degrees. Compared with the expectation-maximization (EM) algorithm, a well-known distribution estimation algorithm, the PLV algorithm has the considerable advantage that it can be applied to various data types, i.e., it is not restricted to Gaussian data distributions. The proposed algorithm also outperforms the well-known fuzzy c-means (FCM) clustering algorithm, in that it can identify compact regions rather than simply dividing objects into several equal populations. The performance of the proposed algorithm is verified by clustering tasks on several medical diagnosis and biological expression datasets.
Z. Yin and J. Chiang, "Patterns Discovery on Complex Diagnosis and Biological Data Using Fuzzy Latent Variables," 2007 IEEE 23rd International Conference on Data Engineering (ICDE), Istanbul, Turkey, 2007, pp. 576-585.