16th International Conference on Pattern Recognition (ICPR'02) - Volume 1
Discriminative Features for Document Classification
Quebec City, QC, Canada
August 11-August 15
ISBN: 0-7695-1695-X
Kari Torkkola, Motorola Labs
Bag-of-words document representations often require dimensionality reduction before statistical classification methods can be applied effectively. Latent Semantic Indexing (LSI) is one such method, based on eigendecomposition of the covariance of the document-term matrix. Another often-used approach is to select a small number of the most important features from the whole set according to some relevant criterion. This paper points out that LSI ignores discrimination while concentrating on representation. Furthermore, selection methods fail to produce a feature set that jointly optimizes class discrimination. As a remedy, we suggest supervised linear discriminative transforms, and report good classification results from applying these to the Reuters-21578 database.
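The contrast between representation and discrimination can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a hypothetical two-term toy corpus, uses the leading eigenvector of the total scatter matrix as a stand-in for an LSI-style direction, and uses Fisher's two-class linear discriminant as one example of a supervised linear discriminative transform. All names and data below (`docs`, `labels`, `variance_dir`, `fisher_dir`) are illustrative assumptions.

```python
import math

# Hypothetical toy data: each row is a document with counts for two terms.
# Term 0 varies a lot but is unrelated to the class; term 1 varies little
# but separates the two classes.
docs = [
    [1.0, 1.0], [5.0, 1.2], [9.0, 0.8], [13.0, 1.0],   # class 0
    [1.0, 2.0], [5.0, 2.2], [9.0, 1.8], [13.0, 2.0],   # class 1
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]


def mean(rows):
    """Per-term mean of a list of 2-D rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter(rows, center):
    """2x2 scatter matrix: sum of outer products of centered rows."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - center[0], r[1] - center[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def unit(v):
    """Scale a 2-D vector to unit length."""
    norm = math.hypot(v[0], v[1])
    return [v[0] / norm, v[1] / norm]


def top_eigenvector(s):
    """Leading eigenvector of a symmetric 2x2 matrix [[a, b], [b, c]]."""
    a, b, c = s[0][0], s[0][1], s[1][1]
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b * b)
    # (b, lam - a) is an eigenvector when b != 0; otherwise pick the axis.
    if abs(b) > 1e-12:
        return unit([b, lam - a])
    return [1.0, 0.0] if a >= c else [0.0, 1.0]


def fisher_direction(rows, ys):
    """Two-class Fisher discriminant: w = Sw^{-1} (m1 - m0), normalized."""
    c0 = [r for r, y in zip(rows, ys) if y == 0]
    c1 = [r for r, y in zip(rows, ys) if y == 1]
    m0, m1 = mean(c0), mean(c1)
    s0, s1 = scatter(c0, m0), scatter(c1, m1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # Apply the closed-form inverse of the 2x2 within-class scatter matrix.
    w = [(sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
         (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det]
    return unit(w)


def project(rows, direction):
    """Reduce each document to one feature along the given direction."""
    return [r[0] * direction[0] + r[1] * direction[1] for r in rows]


# Unsupervised: direction of maximum variance, computed without labels.
variance_dir = top_eigenvector(scatter(docs, mean(docs)))
# Supervised: direction chosen to separate the labeled classes.
fisher_dir = fisher_direction(docs, labels)

for name, d in (("variance", variance_dir), ("fisher", fisher_dir)):
    z = project(docs, d)
    print(name, [round(x, 2) for x in d],
          "class 0:", [round(x, 2) for x in z[:4]],
          "class 1:", [round(x, 2) for x in z[4:]])
```

In this toy setting the variance-based direction aligns almost entirely with the high-variance term 0, so the two classes overlap after projection, while the Fisher direction aligns with term 1 and keeps them apart. Real LSI is typically computed with a truncated SVD over many terms, so the two-dimensional eigenvector here is only an analogy.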
Citation:
Kari Torkkola, "Discriminative Features for Document Classification," 16th International Conference on Pattern Recognition (ICPR'02) - Volume 1, vol. 1, pp. 10472, 2002