Sixth International Conference on Data Mining (ICDM'06) (2006)
Dec. 18, 2006 to Dec. 22, 2006
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.98
Dimitrios Zeimpekis, University of Patras, Greece
Efstratios Gallopoulos, University of Patras, Greece
We address the problem of building fast and effective text classification tools. We describe a "representatives methodology" related to feature extraction and illustrate its performance using two vehicles: a centroid-based method and a method based on clustered LSI, both recently proposed as useful tools for low-rank matrix approximation and as cost-effective alternatives to LSI. The methodology is very flexible and provides the means for accelerating existing algorithms. It is also combined with kernel techniques to enable the analysis of data for which linear techniques are insufficient. Numerous classification examples indicate that the proposed technique is effective and efficient, with overall performance superior to that of existing linear and nonlinear LSI-based approaches.
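As a rough illustration of the general idea behind centroid-based dimension reduction for text classification, the sketch below projects documents onto the span of the class centroids and assigns each one to the most similar centroid. It is a generic, minimal example and not the algorithm of the paper: the toy term-document matrix, the least-squares projection, the cosine scoring, and the function names `class_centroids`, `project_onto_centroids`, and `classify` are all assumptions made for illustration.

```python
import numpy as np

def class_centroids(A, labels):
    """Column j of C is the mean of the columns of A that belong to class j."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    C = np.column_stack([A[:, labels == c].mean(axis=1) for c in classes])
    return classes, C

def project_onto_centroids(C, X):
    """Least-squares coordinates of each column of X in the centroid basis C."""
    Y, *_ = np.linalg.lstsq(C, X, rcond=None)
    return Y  # shape: (number of classes, number of documents)

def classify(C, classes, X):
    """Assign each column of X to the class whose reduced centroid is most similar."""
    Y = project_onto_centroids(C, X)
    Yc = project_onto_centroids(C, C)  # reduced representation of the centroids
    # Cosine similarity in the reduced space (guard against zero-norm columns).
    Yn = Y / np.maximum(np.linalg.norm(Y, axis=0, keepdims=True), 1e-12)
    Ycn = Yc / np.maximum(np.linalg.norm(Yc, axis=0, keepdims=True), 1e-12)
    return classes[np.argmax(Ycn.T @ Yn, axis=0)]

# Hypothetical toy data: 4 terms (rows) x 4 training documents (columns), 2 classes.
A = np.array([[2.0, 1.0, 0.0, 0.0],
              [1.0, 2.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 2.0],
              [0.0, 0.0, 2.0, 1.0]])
labels = [0, 0, 1, 1]
classes, C = class_centroids(A, labels)

# Two unseen documents, one per column.
X = np.array([[1.0, 0.0],
              [0.0, 0.0],
              [0.0, 0.0],
              [0.0, 3.0]])
predictions = classify(C, classes, X)
```

In this sketch each document is reduced from 4 term weights to 2 coordinates, one per class, so the cost of comparing documents depends on the number of classes rather than on the vocabulary size.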
D. Zeimpekis and E. Gallopoulos, "Linear and Non-Linear Dimensional Reduction via Class Representatives for Text Classification," Sixth International Conference on Data Mining (ICDM'06), Hong Kong, 2006, pp. 1172-1177.