Ben Kao, The University of Hong Kong, Hong Kong
We study the problem of clustering data objects whose locations are uncertain. A data object is represented by an uncertainty region over which a probability density function (pdf) is defined. One method for clustering uncertain objects of this sort is the UK-means algorithm, which is based on the traditional K-means algorithm. In UK-means, an object is assigned to the cluster whose representative has the smallest expected distance to the object. For an arbitrary pdf, calculating the expected distance between an object and a cluster representative requires expensive numerical integration. We study various pruning methods that avoid such expensive expected distance calculations.
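To make the assignment step concrete, the following is a minimal sketch of it in Python, not the authors' implementation. It assumes each uncertain object is given as a finite set of points sampled from its pdf, so the integral for the expected distance is approximated by a sample average. The function names (`expected_distance`, `assign`, `sample_uniform_box`) and the uniform square uncertainty region are hypothetical choices made for illustration, and no pruning is applied.

```python
import math
import random

def expected_distance(samples, rep):
    """Approximate E[d(o, rep)] by averaging the Euclidean distance
    over points sampled from the object's pdf (assumed representation)."""
    return sum(math.dist(s, rep) for s in samples) / len(samples)

def assign(objects, reps):
    """Assign each uncertain object to the index of the cluster
    representative with the smallest (approximate) expected distance."""
    labels = []
    for samples in objects:
        dists = [expected_distance(samples, r) for r in reps]
        labels.append(dists.index(min(dists)))
    return labels

def sample_uniform_box(center, half_width, n, rng):
    """Hypothetical pdf: uniform over an axis-aligned square
    uncertainty region centred at `center`."""
    return [tuple(c + rng.uniform(-half_width, half_width) for c in center)
            for _ in range(n)]

rng = random.Random(0)
objects = [
    sample_uniform_box((0.0, 0.0), 0.5, 200, rng),
    sample_uniform_box((10.0, 10.0), 0.5, 200, rng),
]
reps = [(0.0, 1.0), (9.0, 10.0)]  # illustrative cluster representatives
labels = assign(objects, reps)
```

Every object is compared against every representative here, and each comparison costs one pass over the object's samples. That per-pair cost is the expensive calculation the pruning methods are meant to avoid.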
Citation:
Wang Kay Ngai, Ben Kao, Chun Kit Chui, Reynold Cheng, Michael Chau, Kevin Y. Yip, "Efficient Clustering of Uncertain Data," Sixth IEEE International Conference on Data Mining (ICDM'06), pp. 436-445, 2006.