Issue No.01 - Jan. (2013 vol.25)
pp: 106-118
Jose F. Rodrigues, Jr., Universidade de São Paulo, São Carlos
Hanghang Tong, IBM T.J. Watson Research, Hawthorne
Jia-Yu Pan, Google, Inc., Pittsburgh
Agma J.M. Traina, Universidade de São Paulo, São Carlos
Caetano Traina, Jr., Universidade de São Paulo, São Carlos
Christos Faloutsos, Carnegie Mellon University, Pittsburgh
ABSTRACT
Current applications have produced graphs on the order of hundreds of thousands of nodes and millions of edges. To take advantage of such graphs, one must be able to find patterns, outliers, and communities. These tasks are better performed in an interactive environment, where human expertise can guide the process. Large graphs, however, pose two challenges: the processing requirements are prohibitive, and drawing hundreds of thousands of nodes results in cluttered images that are hard to comprehend. To cope with these problems, we propose GMine, an innovative framework suited for any kind of tree-like graph visual design. GMine integrates 1) a representation for graphs organized as hierarchies of partitions, based on the concepts of SuperGraph and Graph-Tree; and 2) a graph summarization methodology, CEPS. Our graph representation addresses the problem of tracing the connections within a graph hierarchy with sublinear complexity, allowing one to grasp the neighborhood of a single node or of a group of nodes in a single click. As a proof of concept, the visual environment of GMine is instantiated as a system in which large graphs can be investigated globally and locally.
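To make the idea of a graph organized as a hierarchy of partitions more concrete, the following is a minimal Python sketch. It is not the SuperGraph or Graph-Tree structure from the paper: the class `PartitionNode`, the function `super_edges`, and the toy graph are all hypothetical names chosen for illustration. The sketch simply counts how many original edges cross between sibling partitions by scanning the whole edge list, so it runs in linear time and does not reproduce the sublinear tracing that the abstract describes.

```python
from collections import defaultdict


class PartitionNode:
    """Hypothetical hierarchy node: a leaf holds graph vertices,
    an internal node holds child partitions."""

    def __init__(self, name, children=None, vertices=None):
        self.name = name
        self.children = children or []
        self.vertices = set(vertices or [])

    def all_vertices(self):
        """Return every graph vertex contained in this partition."""
        if not self.children:
            return set(self.vertices)
        collected = set()
        for child in self.children:
            collected |= child.all_vertices()
        return collected


def super_edges(node, edges):
    """Count original edges that cross between pairs of children of `node`.

    Edges with an endpoint outside `node` are ignored. This is a plain
    linear scan (an assumption made for clarity, not the paper's method).
    """
    owner = {}
    for child in node.children:
        for vertex in child.all_vertices():
            owner[vertex] = child.name
    counts = defaultdict(int)
    for u, v in edges:
        a, b = owner.get(u), owner.get(v)
        if a is not None and b is not None and a != b:
            counts[tuple(sorted((a, b)))] += 1
    return dict(counts)


# Toy graph: six vertices split into three leaf partitions.
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (1, 6), (2, 5)]
leaf_a = PartitionNode("a", vertices=[1, 2])
leaf_b = PartitionNode("b", vertices=[3, 4])
leaf_c = PartitionNode("c", vertices=[5, 6])
left = PartitionNode("left", children=[leaf_a, leaf_b])
root = PartitionNode("root", children=[left, leaf_c])

top_level = super_edges(root, edges)   # connections between "left" and "c"
inside_left = super_edges(left, edges)  # connections between "a" and "b"
print(top_level, inside_left)
```

Viewing the root shows a single aggregated connection between the two top-level partitions, and descending into `left` reveals the finer connection between `a` and `b`, which is the kind of global-then-local inspection the abstract refers to.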
INDEX TERMS
Data structures, Visualization, Partitioning algorithms, Communities, Social network services, Computational modeling, Layout, Graph visualization, Graph analysis system, Graph representation, Graph mining
CITATION
Jose F. Rodrigues, Jr, Hanghang Tong, Jia-Yu Pan, Agma J.M. Traina, Caetano Traina, Jr, Christos Faloutsos, "Large Graph Analysis in the GMine System", IEEE Transactions on Knowledge & Data Engineering, vol.25, no. 1, pp. 106-118, Jan. 2013, doi:10.1109/TKDE.2011.199