Fast Agglomerative Clustering Using a k-Nearest Neighbor Graph
November 2006 (vol. 28 no. 11)
pp. 1875-1881
We propose a fast agglomerative clustering method that uses an approximate nearest neighbor graph to reduce the number of distance calculations. The time complexity of the algorithm improves from $O(\tau N^2)$ to $O(\tau N \log N)$ at the cost of a slight increase in distortion; here, $\tau$ denotes the number of nearest neighbor updates required at each iteration. In the experiments, a relatively small neighborhood size is sufficient to keep the quality close to that of the full search.
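The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the merge cost is taken to be the Ward (PNN) cost, the k-nearest neighbor graph is built by brute force rather than approximately, and the cheapest edge is found by a linear scan instead of a heap, so the sketch does not reach the stated $O(\tau N \log N)$ bound. The names `knn_agglomerate`, `merge_cost`, and `sq_dist` are hypothetical.

```python
from itertools import combinations

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(p, q))

def merge_cost(ca, na, cb, nb):
    """Ward/PNN cost (assumed here): rise in squared error if the clusters merge."""
    return na * nb / (na + nb) * sq_dist(ca, cb)

def knn_agglomerate(points, num_clusters, k=3):
    """Merge clusters until num_clusters remain, searching only graph edges."""
    if num_clusters < 1:
        raise ValueError("num_clusters must be at least 1")
    cent = {i: tuple(p) for i, p in enumerate(points)}   # cluster centroids
    size = {i: 1 for i in cent}                          # cluster sizes
    members = {i: [i] for i in cent}                     # point indices per cluster

    def cost(a, b):
        return merge_cost(cent[a], size[a], cent[b], size[b])

    # Brute-force k-NN graph, O(N^2). This is a simplification for the sketch;
    # the abstract refers to an approximate graph.
    nbrs = {a: set(sorted((b for b in cent if b != a),
                          key=lambda b: cost(a, b))[:k]) for a in cent}

    while len(cent) > num_clusters:
        # Candidate merges are restricted to graph edges, not all pairs.
        edges = [(cost(a, b), a, b) for a in nbrs for b in nbrs[a]]
        if not edges:  # no edges left: fall back to a full pairwise search
            edges = [(cost(a, b), a, b) for a, b in combinations(cent, 2)]
        _, a, b = min(edges)

        # Merge cluster b into cluster a (size-weighted centroid).
        na, nb = size[a], size[b]
        cent[a] = tuple((na * x + nb * y) / (na + nb)
                        for x, y in zip(cent[a], cent[b]))
        size[a] = na + nb
        members[a] += members.pop(b)
        del cent[b], size[b]

        # The merged cluster inherits both neighbor lists, trimmed back to k.
        merged = (nbrs[a] | nbrs.pop(b)) - {a, b}
        nbrs[a] = set(sorted(merged, key=lambda c: cost(a, c))[:k])

        # Redirect edges that pointed at the removed cluster b.
        for c in nbrs:
            if b in nbrs[c]:
                nbrs[c].discard(b)
                if c != a:
                    nbrs[c].add(a)

    return sorted(sorted(m) for m in members.values())

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(knn_agglomerate(points, num_clusters=2, k=2))
```

Only clusters touched by a merge have their neighbor lists updated, which is the role $\tau$ plays in the complexity bound. Replacing the linear edge scan with a priority queue would be the natural next step toward the logarithmic per-iteration cost.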

Index Terms:
Clustering, agglomeration, nearest neighbor, vector quantization, PNN.
Pasi Fränti, Olli Virmajoki, Ville Hautamäki, "Fast Agglomerative Clustering Using a k-Nearest Neighbor Graph," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 11, pp. 1875-1881, Nov. 2006, doi:10.1109/TPAMI.2006.227