2015 International Conference on Big Data and Smart Computing (BigComp) (2015)

Jeju, South Korea

Feb. 9, 2015 to Feb. 11, 2015

ISBN: 978-1-4799-7303-3

pp: 186-193

Yongsub Lim, Department of Computer Science, KAIST

Won-Jo Lee, Department of Computer Science, KAIST

Ho-Jin Choi, Department of Computer Science, KAIST

U Kang, Department of Computer Science, KAIST

ABSTRACT

Given a real world graph, how can we find a large subgraph whose partition quality is much better than that of the original? Graph partitioning has received great attention in graph mining, and balanced graph partitioning in particular is required in many real world applications. However, balanced graph partitioning is known to be NP-hard, and real graphs are moreover known to have no good cuts at large scales. Because of this difficulty, we propose a new paradigm for graph partitioning. Instead of dealing with the whole graph, we focus on finding a large subgraph with high quality partitions, measured by conductance. We show that removing problematic nodes, i.e., high-degree nodes called hub nodes in real graphs, remarkably decreases conductance for the remaining giant connected component (GCC), while the number of nodes in the GCC remains significant. In experiments, we demonstrate that our method finds a large subgraph with low-conductance partitions, compared with competing methods. We also show that the competitors cannot find connected subgraphs, while our method does so by construction. The improvement in partition quality for the subgraph is especially noticeable for large scale cuts: for a balanced partition, conductance drops to 14% of the original with a GCC size of 70% of the total. As a result, the found subgraph has clear partitions at almost all scales compared with the original, and this result especially helps find communities that are well-formed but hidden by hubs at various scales in real world graphs such as social networks.
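The idea in the abstract (remove hub nodes, keep the giant connected component, compare conductance) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not say how hubs are selected or how many are removed, so the parameter `k`, the helper names, and the toy graph are all hypothetical. Conductance is assumed here to take its standard form, the number of cut edges divided by the smaller of the two side volumes (sums of degrees).

```python
from collections import deque


def conductance(adj, part):
    """Cut edges divided by the smaller volume (sum of degrees) of the two sides."""
    part = set(part)
    cut = sum(1 for u in part for v in adj[u] if v not in part)
    vol_in = sum(len(adj[u]) for u in part)
    vol_out = sum(len(adj[u]) for u in adj if u not in part)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else float("inf")


def remove_hubs(adj, k):
    """Drop the k highest-degree nodes (k is a hypothetical parameter)."""
    hubs = set(sorted(adj, key=lambda u: len(adj[u]), reverse=True)[:k])
    return {u: {v for v in nbrs if v not in hubs}
            for u, nbrs in adj.items() if u not in hubs}


def giant_component(adj):
    """Return the subgraph induced by the largest connected component (BFS)."""
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    queue.append(v)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return {u: adj[u] & best for u in best}


# Toy graph (an assumption for illustration): two triangles joined by one
# bridge edge, plus a hub (node 6) connected to every other node.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
edges += [(6, i) for i in range(6)]
adj = {u: set() for u in range(7)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

community = {0, 1, 2}
before = conductance(adj, community)      # hub edges inflate the cut
core = giant_component(remove_hubs(adj, k=1))
after = conductance(core, community)      # only the bridge edge is cut

print(f"conductance with hub:    {before:.3f}")
print(f"conductance without hub: {after:.3f}")
```

In this toy case the hub contributes three of the four edges crossing the cut, so removing it exposes the two triangles as a clean partition while the remaining six nodes stay connected. Real graphs would need a principled choice of how many hubs to remove, which this sketch does not address.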

INDEX TERMS

Communities, Partitioning algorithms, Social network services, Measurement, Time complexity, Image edge detection

CITATION

Y. Lim, W. Lee, H. Choi and U. Kang, "Discovering large subsets with high quality partitions in real world graphs," *2015 International Conference on Big Data and Smart Computing (BigComp)*, Jeju, South Korea, 2015, pp. 186-193.

doi:10.1109/35021BIGCOMP.2015.7072830
