International Conference on Computing: Theory and Applications (ICCTA'07)
Scalable Clustering with smoka
Kolkata, India
March 05-March 07
ISBN: 0-7695-2770-1
The paper reports a multi-step clustering procedure equipped with a divergence (a distance-like function derived from a convex function). The first step of the procedure is a BIRCH-like algorithm capable of converting very large datasets into "summaries" that require much less computer memory. The second step is the Principal Direction Divisive Partitioning algorithm (PDDP), which partitions the set of "summaries" into k clusters. This partition is the input for a smoothed k-means based clustering algorithm (smoka). The final partition of "summaries" generated by smoka induces a partition of the original dataset. Preliminary numerical experiments with text collections reported in the paper demonstrate smoka's remarkable accuracy and speed of convergence.
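The three-step pipeline described above can be illustrated with a minimal sketch. It is not the paper's implementation and rests on several assumptions: squared Euclidean distance stands in for the general divergence, a single-pass leader-style scan stands in for the BIRCH-like step (no CF-tree), and plain weighted k-means stands in for smoka, so the smoothing is not reproduced. The function names `summarize`, `pddp`, and `weighted_kmeans`, the `radius` parameter, and the synthetic data are all hypothetical.

```python
import numpy as np

def summarize(X, radius):
    """One-pass, BIRCH-like summarization (simplified, no CF-tree).
    Each summary keeps a point count and a linear sum of its members."""
    counts, sums, owner = [], [], []
    for x in X:
        if counts:
            cents = np.array(sums) / np.array(counts)[:, None]
            d = np.sum((cents - x) ** 2, axis=1)
            j = int(np.argmin(d))
            if d[j] <= radius ** 2:        # absorb x into the nearest summary
                counts[j] += 1
                sums[j] = sums[j] + x
                owner.append(j)
                continue
        counts.append(1)                   # otherwise open a new summary
        sums.append(x.astype(float))
        owner.append(len(counts) - 1)
    w = np.array(counts, dtype=float)
    return np.array(sums) / w[:, None], w, np.array(owner)

def pddp(C, w, k):
    """Principal direction divisive partitioning of weighted summaries.
    Assumes k does not exceed the number of distinct summaries."""
    labels = np.zeros(len(C), dtype=int)
    for new in range(1, k):
        # choose the cluster with the largest weighted scatter
        scatter = []
        for c in range(new):
            m = labels == c
            mean = np.average(C[m], axis=0, weights=w[m])
            scatter.append(np.sum(w[m] * np.sum((C[m] - mean) ** 2, axis=1)))
        m = labels == int(np.argmax(scatter))
        mean = np.average(C[m], axis=0, weights=w[m])
        # leading right singular vector = principal direction
        A = (C[m] - mean) * np.sqrt(w[m])[:, None]
        _, _, vt = np.linalg.svd(A, full_matrices=False)
        proj = (C[m] - mean) @ vt[0]
        labels[np.flatnonzero(m)[proj > 0]] = new   # split by projection sign
    return labels

def weighted_kmeans(C, w, labels, k, iters=50):
    """Plain weighted k-means started from a given partition
    (a stand-in for smoka; no smoothing is applied)."""
    cents = np.zeros((k, C.shape[1]))
    for _ in range(iters):
        for c in range(k):
            m = labels == c
            if m.any():                    # keep the old centroid if a cluster empties
                cents[c] = np.average(C[m], axis=0, weights=w[m])
        d = ((C[:, None, :] - cents[None, :, :]) ** 2).sum(axis=2)
        new = d.argmin(axis=1)
        if np.array_equal(new, labels):
            break
        labels = new
    return labels

# Synthetic data: three well-separated blobs of 200 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc, 0.1, size=(200, 2))
               for loc in ([0, 0], [4, 0], [10, 1])])

C, w, owner = summarize(X, radius=0.5)            # step 1: summaries
labels = weighted_kmeans(C, w, pddp(C, w, 3), 3)  # steps 2 and 3
point_labels = labels[owner]                      # induced partition of X
```

The last line shows how a partition of the summaries induces a partition of the original dataset: each point inherits the cluster of the summary that absorbed it, so the clustering steps only ever touch the much smaller set of summaries.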
Citation:
Jacob Kogan, "Scalable Clustering with smoka," iccta, pp. 299-303, International Conference on Computing: Theory and Applications (ICCTA'07), 2007