18th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'06) (2006)
Arlington, Virginia
Nov. 13, 2006 to Nov. 15, 2006
ISSN: 1082-3409
ISBN: 0-7695-2728-0
pp: 489-496
Tianming Hu , DongGuan U. of Technology
Chew Lim Tan , National U. of Singapore, Singapore
Sam Yuan Sung , South Texas College, USA
Chao Qu , DongGuan U. of Technology
Wenjun Zhou , Rutgers University, USA
This paper describes a new bipartite formulation for word-document co-clustering in which hyperclique patterns (here, sets of strongly affiliated documents) are guaranteed not to be split across clusters. Our approach to pattern-preserving clustering consists of three steps: mine maximal hyperclique patterns, form the bipartite graph, and partition it. With hyperclique patterns of documents preserved, the topic of each cluster can be represented by both the top words from that cluster and the documents in its patterns, which are expected to be more compact and representative than those in the standard bipartite formulation. Experiments with real-world datasets show that using hyperclique patterns as starting points improves the clustering results under various external clustering criteria. In addition, the partitioned bipartite graph, with its preserved topical sets of documents, naturally lends itself to different functions in search engines.
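The first two of the three steps can be illustrated with a minimal Python sketch on a toy corpus. Everything named here (`doc_words`, `min_hconf`, `h_confidence`, `maximal_hypercliques`, `build_bipartite`) is hypothetical and not taken from the paper. The sketch assumes the usual h-confidence definition with documents as items and words as transactions, uses brute-force enumeration in place of a real pattern miner, and assumes that collapsing each pattern into a single document vertex is one way to obtain the no-split guarantee; the paper's actual formulation may differ.

```python
from itertools import combinations

def h_confidence(doc_words, pattern):
    """Words shared by every document in the pattern, divided by the
    word count of the largest document (documents = items, words = transactions)."""
    shared = set.intersection(*(doc_words[d] for d in pattern))
    largest = max(len(doc_words[d]) for d in pattern)
    return len(shared) / largest if largest else 0.0

def maximal_hypercliques(doc_words, min_hconf):
    """Brute-force enumeration; only suitable for toy inputs."""
    docs = sorted(doc_words)
    found = [frozenset(c)
             for k in range(2, len(docs) + 1)
             for c in combinations(docs, k)
             if h_confidence(doc_words, c) >= min_hconf]
    # Keep only patterns that are not a proper subset of another pattern.
    return [p for p in found if not any(p < q for q in found)]

def build_bipartite(doc_words, patterns):
    """Word-to-vertex edge weights, with each pattern collapsed into one vertex.
    Assumes the patterns are disjoint; if not, the first pattern seen wins."""
    vertex_of = {}
    for p in patterns:
        for d in p:
            vertex_of.setdefault(d, p)
    edges = {}
    for d, words in doc_words.items():
        v = vertex_of.get(d, frozenset([d]))  # unmatched documents stay singletons
        for w in words:
            edges[(w, v)] = edges.get((w, v), 0) + 1
    return edges

# Toy corpus (made up for illustration).
doc_words = {
    "d1": {"graph", "cluster", "pattern"},
    "d2": {"graph", "cluster", "pattern", "mining"},
    "d3": {"stock", "market"},
    "d4": {"stock", "market", "price"},
    "d5": {"graph", "price"},
}
min_hconf = 0.6  # assumed threshold

patterns = maximal_hypercliques(doc_words, min_hconf)
edges = build_bipartite(doc_words, patterns)
```

Any bipartite graph partitioner could then be run on `edges` for the third step. Because each pattern is a single vertex in this sketch, no partition of the graph can place its documents in different clusters.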
Tianming Hu, Chew Lim Tan, Sam Yuan Sung, Chao Qu, Wenjun Zhou, "Preserving Patterns in Bipartite Graph Partitioning", 18th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'06), pp. 489-496, 2006, doi:10.1109/ICTAI.2006.97