20th Annual International Conference on High Performance Computing (2012)
Pune, India
Dec. 18, 2012 to Dec. 22, 2012
ISBN: 978-1-4673-2372-7
pp: 1-9
William Hendrix, Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL 60208
Md. Mostofa Ali Patwary, Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL 60208
Ankit Agrawal, Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL 60208
Wei-keng Liao, Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL 60208
Alok Choudhary, Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL 60208
ABSTRACT
Hierarchical clustering has many advantages over traditional clustering algorithms like k-means, but it suffers from higher computational costs and a less obvious parallel structure. Thus, in order to scale this technique up to larger datasets, we present SHRINK, a novel shared-memory algorithm for single-linkage hierarchical clustering based on merging the solutions from overlapping sub-problems. In our experiments, we find that SHRINK provides a speedup of 18–20 on 36 cores on both real and synthetic datasets of up to 250,000 points. Source code for SHRINK is available for download on our website, http://cucis.ece.northwestern.edu.
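The idea of merging solutions from overlapping sub-problems can be illustrated with a small sequential sketch. It is not the SHRINK implementation. It assumes the standard equivalence between single-linkage clustering and the minimum spanning tree (MST) of the pairwise-distance graph, and the function names, the round-robin partitioning, and the `num_parts` parameter are all hypothetical choices made for illustration. Each pair of partitions forms one overlapping sub-problem; the MST edges of every sub-problem are pooled, and a final MST pass over the pooled edges yields the merge heights of the full dendrogram.

```python
from itertools import combinations
from math import dist


def kruskal(n, edges):
    """Return the MST (or spanning forest) edges of a graph on nodes 0..n-1.

    `edges` is an iterable of (weight, u, v) tuples. The result is ordered by
    increasing weight, which is also the single-linkage merge order.
    """
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((w, u, v))
    return tree


def single_linkage_merged(points, num_parts=3):
    """Single-linkage merges obtained by combining overlapping sub-problems.

    Assumption: num_parts >= 2. Points are split round-robin into partitions,
    and every pair of partitions is solved as an independent sub-problem.
    """
    n = len(points)
    parts = [list(range(i, n, num_parts)) for i in range(num_parts)]

    candidates = set()
    for a, b in combinations(range(num_parts), 2):
        ids = sorted(parts[a] + parts[b])
        edges = [(dist(points[u], points[v]), u, v)
                 for u, v in combinations(ids, 2)]
        # Every pair of points shares at least one sub-problem, so by the
        # cycle property the global MST edges survive in the pooled set.
        candidates.update(kruskal(n, edges))

    # Merge step: one more MST pass over the pooled candidate edges.
    return kruskal(n, candidates)


points = [(0, 0), (1, 0), (5, 0), (6, 0), (20, 0)]
merges = single_linkage_merged(points)
heights = [w for w, _, _ in merges]
```

In this sketch the sub-problems inside the loop are independent of one another, so they could in principle be distributed across threads or processes; the abstract does not specify how SHRINK schedules or merges its sub-problems, so that part is left out.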
CITATION

W. Hendrix, M. M. Ali Patwary, A. Agrawal, W. Liao and A. Choudhary, "Parallel hierarchical clustering on shared memory platforms," 20th Annual International Conference on High Performance Computing (HiPC), Pune, India, 2012, pp. 1-9.
doi:10.1109/HiPC.2012.6507511