IEEE International Conference on Data Mining Workshops (ICDMW 2006)

Hong Kong, China

Dec. 18, 2006 to Dec. 22, 2006

ISBN: 0-7695-2702-7

pp: 412-416

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDMW.2006.48

A. Sutojo, Dept. of Comput. Sci., San Jose State Univ., CA

J.-D. Hsu, Dept. of Comput. Sci., San Jose State Univ., CA

T.Y. Lin, Dept. of Comput. Sci., San Jose State Univ., CA

ABSTRACT

The collection of concepts discussed in a document set can be represented by a geometric structure from combinatorial topology called a simplicial complex. A simplex is a set of high-frequency keywords that co-occur closely and which, we believe, carries a concept in the document set. The collection of all these simplexes forms the simplicial complex, which represents the structure of these concepts. The documents are clustered based on the topological structure of this complex. Several clustering schemes are presented. Our initial experiments support the theory, as expected.
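The abstract does not spell out the clustering schemes, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that "co-occurs closely" can be approximated by co-occurrence within the same document, that "high-frequency" means appearing in at least `min_support` documents, and that a document joins the cluster of every maximal simplex whose keywords it contains. The function names, the `min_support` and `max_dim` parameters, and the toy documents are all hypothetical.

```python
from itertools import combinations


def frequent_simplexes(docs, min_support, max_dim=2):
    """Return {keyword_set: support} for sets found in >= min_support docs.

    A set of q+1 keywords is treated as a q-simplex; max_dim bounds q.
    Assumption: co-occurrence is measured per whole document.
    """
    doc_sets = [set(d) for d in docs]
    vocabulary = sorted(set().union(*doc_sets))
    candidates = [frozenset([w]) for w in vocabulary]
    simplexes = {}
    for dim in range(max_dim + 1):
        # Keep the candidates that occur in enough documents.
        level = {}
        for cand in candidates:
            support = sum(1 for d in doc_sets if cand <= d)
            if support >= min_support:
                level[cand] = support
        if not level:
            break
        simplexes.update(level)
        # Apriori-style join: a larger candidate is kept only if every
        # face (subset with one keyword removed) is itself frequent, so
        # the result is closed under taking faces, as a complex must be.
        larger = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == dim + 2 and all(union - {w} in level for w in union):
                larger.add(union)
        candidates = sorted(larger, key=sorted)
    return simplexes


def maximal_simplexes(simplexes):
    """Simplexes that are not a proper face of any other simplex."""
    return [s for s in simplexes if not any(s < t for t in simplexes)]


def cluster_by_maximal_simplex(docs, simplexes):
    """Map each maximal simplex to the indices of documents containing it."""
    clusters = {m: [] for m in maximal_simplexes(simplexes)}
    for index, doc in enumerate(docs):
        words = set(doc)
        for simplex in clusters:
            if simplex <= words:
                clusters[simplex].append(index)
    return clusters


# Hypothetical toy corpus: each document is a list of keywords.
docs = [
    ["wall", "street", "stock"],
    ["wall", "street", "bank"],
    ["white", "house", "senate"],
    ["white", "house", "vote"],
]
simplexes = frequent_simplexes(docs, min_support=2, max_dim=1)
clusters = cluster_by_maximal_simplex(docs, simplexes)
for simplex, members in clusters.items():
    print(sorted(simplex), members)
```

In this sketch a document may fall into more than one cluster when it contains several maximal simplexes, and documents containing none are left unassigned. A scheme that uses more of the complex's topology (for example, its connected components) would need a different grouping step.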

INDEX TERMS

Topology, Data mining, Association rules, Computer science, Web sites, Clustering methods, Singular value decomposition, Matrix decomposition, Spine, Mathematical analysis

