Third IEEE International Conference on Data Mining (ICDM'03)
Tree-structured Partitioning Based on Splitting Histograms of Distances
Melbourne, Florida
November 19-22, 2003
ISBN: 0-7695-1978-4
Longin Jan Latecki, Temple University, Philadelphia, PA
Rajagopal Venugopal, Temple University, Philadelphia, PA
Marc Sobel, Temple University, Philadelphia, PA
Steve Horvath, Univ. of California, Los Angeles, CA
We propose a novel clustering algorithm that is similar in spirit to classification trees. The data is recursively split using a criterion that applies a discrete curve evolution method to the histogram of distances. The algorithm can be depicted as a tree diagram in which each split has three branches. Leaf nodes represent either clusters or sets of observations that cannot yet be clearly assigned to a cluster. After the tree is constructed, the unclassified data points are mapped to their closest clusters. The algorithm has several advantages. First, it deals effectively with observations that cannot be unambiguously assigned to a cluster by allowing a "margin of error". Second, it automatically determines the number of clusters: apart from the margin of error, the user needs to specify only the minimal cluster size, not the number of clusters. Third, its running time is linear in the number of data points, which makes it suitable for very large data sets. Experiments involving both simulated and real data from different domains show that the proposed method is effective and efficient.
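The abstract gives no implementation details, so the following is only a minimal Python sketch of a single triple split followed by the final assignment step, under several assumptions that are not taken from the paper. Distances are measured from one hypothetical reference point `ref`; the valley of the histogram is found with a simple "emptiest interior bin" rule that stands in for the discrete curve evolution criterion; and the "margin of error" is modeled as a hypothetical `margin` parameter counted in histogram bins. The function names `distance_histogram`, `triple_split`, and `assign_unclear` are illustrative only.

```python
import math


def distance_histogram(points, ref, bins=10):
    """Histogram of Euclidean distances from each point to a reference point."""
    dists = [math.dist(p, ref) for p in points]
    lo, hi = min(dists), max(dists)
    width = (hi - lo) / bins or 1.0  # avoid zero width if all distances are equal
    counts = [0] * bins
    for d in dists:
        counts[min(int((d - lo) / width), bins - 1)] += 1
    return dists, counts, lo, width


def triple_split(points, ref, bins=10, margin=1):
    """Split points into (near, unclear, far) around a valley of the histogram.

    Assumption: the valley is the emptiest interior bin (ties resolved by
    taking the middle candidate). This is a stand-in for discrete curve
    evolution, not the criterion used in the paper. Requires bins >= 3.
    """
    dists, counts, lo, width = distance_histogram(points, ref, bins)
    lowest = min(counts[1:-1])
    candidates = [i for i in range(1, bins - 1) if counts[i] == lowest]
    valley = candidates[len(candidates) // 2]
    # Bins within `margin` of the valley form the "margin of error" zone.
    lower = lo + max(valley - margin, 0) * width
    upper = lo + min(valley + margin + 1, bins) * width
    near, unclear, far = [], [], []
    for p, d in zip(points, dists):
        if d < lower:
            near.append(p)
        elif d >= upper:
            far.append(p)
        else:
            unclear.append(p)
    return near, unclear, far


def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(cluster)
    return tuple(sum(c) / n for c in zip(*cluster))


def assign_unclear(near, far, unclear):
    """Map each unclear point to the cluster with the closer centroid.

    Assumes both clusters are non-empty; centroids are computed once,
    before any unclear point is added.
    """
    near, far = list(near), list(far)
    c_near, c_far = centroid(near), centroid(far)
    for p in unclear:
        target = near if math.dist(p, c_near) <= math.dist(p, c_far) else far
        target.append(p)
    return near, far
```

For a fixed number of bins, each function makes a constant number of passes over the points, which is in keeping with the linear running time claimed in the abstract. A full implementation would apply the split recursively and stop when a node falls below the minimal cluster size.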
Citation:
Longin Jan Latecki, Rajagopal Venugopal, Marc Sobel, Steve Horvath, "Tree-structured Partitioning Based on Splitting Histograms of Distances," Third IEEE International Conference on Data Mining (ICDM'03), p. 577, 2003.