Issue No. 12 - Dec. (2015 vol. 27)
Lei Shi, State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China
Hanghang Tong, School of Computing, Informatics and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA
Jie Tang, Department of Computer Science and Technology, Tsinghua University, Beijing, China
Chuang Lin, Department of Computer Science and Technology, Tsinghua University, Beijing, China
Visually analyzing citation networks poses challenges to many fields of data mining research. How can we summarize a large citation graph according to the user's interest? In particular, how can we illustrate the impact of a highly influential paper through the summarization? Can we maintain the sensory node-link graph structure while revealing the flow-based influence patterns and preserving good readability? State-of-the-art influence maximization algorithms can detect the most influential node in a citation network, but they fail to summarize the graph structure that accounts for its influence. On the other hand, existing graph summarization methods fold large graphs into clustered views, but cannot reveal the hidden influence patterns underneath the citation network. In this paper, we first formally define the Influence Graph Summarization (IGS) problem on citation networks. Second, we propose a matrix decomposition based algorithm pipeline to solve the IGS problem. Our method not only highlights the flow-based influence patterns, but also extends easily to support rich attribute information. A prototype system called VEGAS, which implements this pipeline, is also developed. Third, we present a theoretical analysis of our main algorithm, which is equivalent to kernel k-means clustering. It can be proved that the matrix decomposition based algorithm approximates the objective of the proposed IGS problem. Last, we conduct comprehensive experiments with real-world citation networks to compare the proposed algorithm with classical graph summarization methods. Evaluation results demonstrate that our method significantly outperforms previous ones in optimizing both the quantitative IGS objective and the quality of the visual summarizations.
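The abstract states that the main algorithm is equivalent to kernel k-means clustering but gives no formulation. As a rough illustration of that general idea only, the sketch below runs a generic Lloyd-style kernel k-means on a toy citation graph and then folds the graph into a cluster-level flow matrix. It is not the VEGAS pipeline: the kernel (a linear kernel on link profiles), the function names `kernel_kmeans` and `summarize`, the initial labels, and the six-node graph are all assumptions made for this example.

```python
import numpy as np

def kernel_kmeans(K, labels, n_clusters, n_iter=20):
    """Generic kernel k-means on a precomputed kernel matrix K (n x n).

    `labels` is the initial cluster assignment (hypothetical input,
    not something the abstract specifies).
    """
    labels = np.asarray(labels).copy()
    n = K.shape[0]
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue  # an empty cluster keeps infinite distance
            # ||phi(i) - mu_c||^2 = K_ii - 2*mean_j K_ij + mean_{j,l} K_jl
            dist[:, c] = (np.diag(K)
                          - 2.0 * K[:, members].mean(axis=1)
                          + K[np.ix_(members, members)].mean())
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
    return labels

def summarize(A, labels, n_clusters):
    """Aggregate directed edge weights between clusters: S = Z^T A Z."""
    Z = np.eye(n_clusters)[labels]  # one-hot cluster membership matrix
    return Z.T @ A @ Z

# Toy directed graph (assumed data): edge i -> j means influence
# flows from paper i to paper j, with paper 0 as the source.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = 1.0

# Assumed kernel: linear kernel on each node's undirected link profile.
B = A + A.T + np.eye(6)
K = B @ B.T

labels = kernel_kmeans(K, [0, 0, 0, 0, 1, 1], n_clusters=2)
S = summarize(A, labels, n_clusters=2)
print(labels)  # cluster index per paper
print(S)       # S[a, b] = total edge weight from cluster a to cluster b
```

In the summary matrix, diagonal entries count edges folded inside a cluster and off-diagonal entries count the influence flow between clusters, which is the kind of quantity a flow-oriented summary view would draw as links between summary nodes.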
Matrix decomposition, Clustering algorithms, Algorithm design and analysis, Pipelines, Visualization, Approximation algorithms, Kernel
L. Shi, H. Tong, J. Tang and C. Lin, "VEGAS: Visual influEnce GrAph Summarization on Citation Networks," in IEEE Transactions on Knowledge & Data Engineering, vol. 27, no. 12, pp. 3417-3431, 2015.