First IEEE International Conference on Data Mining (ICDM'01)
Document Clustering and Cluster Topic Extraction in Multilingual Corpora
San Jose, California
November 29-December 02
ISBN: 0-7695-1119-8
A statistics-based approach for clustering documents and extracting cluster topics is described. Relevant (meaningful) Expressions (REs), automatically extracted from corpora, are used as the base features for clustering. These features are transformed and their number is strongly reduced to obtain a small set of document classification features. This reduction is achieved by means of Principal Components Analysis. Model-Based Clustering Analysis then finds the best number of clusters. Finally, the most important REs are extracted from each cluster and taken as the document cluster topics.
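The pipeline summarized above (RE features, reduction by principal components, model-based choice of the number of clusters, topic extraction) can be illustrated with a minimal sketch. It is not the authors' implementation. The toy counts, the RE strings, and all function names are hypothetical. As simplifying assumptions, only the first principal component is kept, the model-based step is a one-dimensional Gaussian mixture fitted by EM and compared by BIC, and topics are ranked by mean count within a cluster; the abstract does not specify any of these choices.

```python
import math

# Hypothetical toy data: rows are documents, columns are counts of each RE.
res = ["data mining", "cluster analysis", "principal components",
       "common market", "european parliament", "member states"]
docs = [
    [3, 2, 2, 0, 0, 0],
    [2, 3, 1, 0, 0, 1],
    [4, 1, 2, 0, 1, 0],
    [0, 0, 0, 3, 2, 2],
    [0, 1, 0, 2, 3, 2],
    [1, 0, 0, 2, 2, 4],
]

def first_component_scores(rows, iters=200):
    """Project centered rows onto the first principal component (power iteration)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    v = [1.0 / (j + 1) for j in range(d)]  # arbitrary non-zero start vector
    for _ in range(iters):
        # One step of v <- X^T X v, followed by normalization.
        proj = [sum(x * w for x, w in zip(row, v)) for row in centered]
        v = [sum(p * row[j] for p, row in zip(proj, centered)) for j in range(d)]
        norm = math.sqrt(sum(w * w for w in v))
        v = [w / norm for w in v]
    return [sum(x * w for x, w in zip(row, v)) for row in centered]

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(xs, k, iters=100, var_floor=0.1):
    """Fit a k-component 1-D Gaussian mixture with EM; return (BIC, labels)."""
    n = len(xs)
    ordered = sorted(xs)
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    mus = [ordered[(2 * c + 1) * n // (2 * k)] for c in range(k)]
    mean = sum(xs) / n
    var0 = max(sum((x - mean) ** 2 for x in xs) / n, var_floor)
    variances = [var0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, mus, variances)]
            total = sum(p)
            resp.append([pc / total for pc in p])
        # M-step: update weights, means, and (floored) variances.
        for c in range(k):
            nc = max(sum(r[c] for r in resp), 1e-12)
            weights[c] = nc / n
            mus[c] = sum(r[c] * x for r, x in zip(resp, xs)) / nc
            spread = sum(r[c] * (x - mus[c]) ** 2 for r, x in zip(resp, xs)) / nc
            variances[c] = max(spread, var_floor)
    loglik = sum(
        math.log(sum(w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, mus, variances)))
        for x in xs
    )
    n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
    bic = -2 * loglik + n_params * math.log(n)
    labels = [
        max(range(k), key=lambda c: weights[c] * normal_pdf(x, mus[c], variances[c]))
        for x in xs
    ]
    return bic, labels

scores = first_component_scores(docs)

# Model selection: keep the number of clusters with the lowest BIC.
fits = {k: fit_gmm_1d(scores, k) for k in (1, 2, 3)}
best_k = min(fits, key=lambda k: fits[k][0])
labels = fits[best_k][1]

# Topics: for each cluster, the two REs with the highest mean count.
topics = {}
for c in sorted(set(labels)):
    members = [row for row, lab in zip(docs, labels) if lab == c]
    mean_counts = [sum(col) / len(members) for col in zip(*members)]
    ranked = sorted(range(len(res)), key=lambda j: -mean_counts[j])
    topics[c] = [res[j] for j in ranked[:2]]

print(best_k, labels, topics)
```

On real corpora one would presumably keep several principal components and use a multivariate mixture model; the one-dimensional version here only keeps the example short and dependency-free.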
Citation:
Joaquim Silva, João Mexia, Agra Coelho, Gabriel Lopes, "Document Clustering and Cluster Topic Extraction in Multilingual Corpora," First IEEE International Conference on Data Mining (ICDM'01), pp. 513, 2001