2005 International Conference on Cyberworlds (CW'05)
A Document Classification Approach By GA Feature Extraction Based Corner Classification Neural Network
Singapore
November 23-November 25
ISBN: 0-7695-2378-1
Weifeng Zhang, Nanjing University of Posts and Telecommunications, China
Baowen Xu, Southeast University, China
Zifeng Cui, Southeast University, China
The CC4 neural network is a corner classification training algorithm for three-layer feedforward neural networks, and it has been used successfully in the metasearch engine Anvish. When the documents are of nearly the same size, the CC4 neural network is an effective document classification algorithm. In general, however, document sizes differ greatly, and CC4 uses the whole dictionary as the vector space, so many documents end up represented by sparse vectors. This paper proposes GA-CC4, a neural network based on feature extraction. GA feature extraction selects the feature items that actually represent the documents in the document set, and these items form the feature item set. Each document is then represented using this feature item set combined with the document frequency. This reduces the number of dimensions needed to represent a document, which addresses the precision problem caused by differing document sizes. The method also maps the scalar features to the Boolean input of the neural network by binary coding, which improves the quality of the network's input data.
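The last step of the abstract, mapping scalar features to Boolean network inputs by binary coding, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `encode_document`, the use of a capped per-document term count as the scalar feature, the fixed bit width, and the sample terms are all assumptions made for illustration, and the GA selection of feature items is taken as already done.

```python
from collections import Counter


def encode_document(tokens, feature_terms, bits=3):
    """Map a tokenized document to a Boolean (0/1) input vector.

    Assumption: each feature term contributes one scalar, its count in the
    document, capped at the largest value that fits in `bits` bits.
    """
    counts = Counter(tokens)
    max_value = (1 << bits) - 1
    vector = []
    for term in feature_terms:
        value = min(counts[term], max_value)
        # Emit the fixed-width binary code, most significant bit first.
        vector.extend((value >> shift) & 1 for shift in range(bits - 1, -1, -1))
    return vector


# Hypothetical feature item set, standing in for the output of GA extraction.
feature_terms = ["neural", "network", "search"]
doc = "neural network training for neural search with neural models".split()
print(encode_document(doc, feature_terms, bits=2))  # [1, 1, 0, 1, 0, 1]
```

Under these assumptions the input length is the number of feature items times the bit width, so it depends on the size of the extracted feature set rather than on the size of the whole dictionary.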
Citation:
Weifeng Zhang, Baowen Xu, Zifeng Cui, "A Document Classification Approach By GA Feature Extraction Based Corner Classification Neural Network," cw, pp.499-504, 2005 International Conference on Cyberworlds (CW'05), 2005