A Distributed Approach to Node Clustering in Decentralized Peer-to-Peer Networks
September 2005 (vol. 16 no. 9)
pp. 814-829

Abstract: Connectivity-based node clustering has wide-ranging applications in decentralized peer-to-peer (P2P) networks, including P2P file sharing systems, mobile ad hoc networks, and P2P sensor networks. This paper describes a Connectivity-based Distributed Node Clustering scheme (CDC), a scalable and efficient solution for discovering connectivity-based clusters in peer networks. In contrast to centralized graph clustering algorithms, the CDC scheme is completely decentralized: each node needs to know only its neighbor nodes, and no global knowledge of the network (graph) is required. An important feature of the CDC scheme is that it can either cluster the entire network automatically or discover clusters around a given set of nodes. To cope with the typical dynamics of P2P networks, we provide mechanisms that incorporate new nodes into appropriate existing clusters and gracefully handle the departure of nodes from clusters. These mechanisms make the CDC scheme extensible and adaptable, in the sense that the clustering structure of the network adjusts automatically as nodes join or leave the system. We provide detailed experimental evaluations of the CDC scheme, addressing its effectiveness in discovering good-quality clusters and in handling node dynamics. We further study the types of topologies that benefit most from connectivity-based distributed clustering algorithms such as CDC. Our experiments show that utilizing the message-based connectivity structure can considerably reduce the messaging cost and provide better utilization of resources, which in turn improves the quality of service of applications executing over decentralized peer-to-peer networks.
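
The abstract does not spell out the CDC algorithm itself, so the following is only an illustrative sketch of the general idea it describes: clusters are grown around a given set of nodes, and each step uses nothing beyond a node's own neighbor list. Everything in it is an assumption made for illustration, not the paper's method: the function name `cluster_by_message_weight`, the `ttl` hop limit, and the rule that a node splits the weight it forwards equally among its neighbors are all hypothetical. The loop simulates the message exchange in one process purely for readability.

```python
from collections import defaultdict


def cluster_by_message_weight(neighbors, originators, ttl=3):
    """Assign each node to the originator whose messages reach it most strongly.

    neighbors:   dict mapping node -> list of adjacent nodes (undirected graph)
    originators: nodes around which clusters are grown
    ttl:         number of hops a message may travel (hypothetical parameter)
    Returns a dict mapping node -> chosen originator, or None if unreached.
    """
    # received[node][origin] = total weight of origin's messages seen at node
    received = {node: defaultdict(float) for node in neighbors}

    for origin in originators:
        received[origin][origin] += 1.0
        frontier = {origin: 1.0}  # weight currently held at each node
        for _ in range(ttl):
            next_frontier = defaultdict(float)
            for node, weight in frontier.items():
                if not neighbors[node]:
                    continue
                # Local rule: split the weight equally among direct neighbors.
                share = weight / len(neighbors[node])
                for nbr in neighbors[node]:
                    next_frontier[nbr] += share
            for node, weight in next_frontier.items():
                received[node][origin] += weight
            frontier = next_frontier

    assignment = {}
    for node, weights in received.items():
        # Ties are broken in favor of the smallest originator identifier.
        assignment[node] = max(sorted(weights), key=weights.get) if weights else None
    return assignment


# Two triangles (0, 1, 2) and (3, 4, 5) joined by the single edge 2-3.
graph = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
clusters = cluster_by_message_weight(graph, originators=[0, 5], ttl=3)
```

On this small graph, nodes 0 to 2 end up in the cluster of originator 0 and nodes 3 to 5 in that of originator 5, because little weight crosses the single bridging edge. In an actual decentralized deployment, each node would keep only its own row of `received` and exchange the shares as network messages.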

Index Terms:
Distributed node clustering, connectivity-based graph clustering, peer-to-peer networks, decentralized network management.
Lakshmish Ramaswamy, Bugra Gedik, Ling Liu, "A Distributed Approach to Node Clustering in Decentralized Peer-to-Peer Networks," IEEE Transactions on Parallel and Distributed Systems, vol. 16, no. 9, pp. 814-829, Sept. 2005, doi:10.1109/TPDS.2005.101