2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery
Web Page Classification Based on a Least Square Support Vector Machine with Latent Semantic Analysis
October 18-October 20
ISBN: 978-0-7695-3305-6
Chinese web page classification (WPC) is an active research area in data mining. To classify web pages effectively, we present a web page categorization method based on a least square support vector machine (LS-SVM) with latent semantic analysis (LSA). LSA uses Singular Value Decomposition (SVD) to obtain the latent semantic structure of the original term-document matrix, which addresses the problem of polysemous and synonymous keywords. LS-SVM is an effective method for learning classification knowledge from massive data, especially when labeled examples are costly to obtain. We adopt a novel method of web page expression and use a summarization algorithm to reduce the noise in web pages. A preliminary experimental comparison shows encouraging results.
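The LSA and LS-SVM steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a binary task, a linear kernel, a tiny hand-made term-document matrix, and the standard LS-SVM linear system for classification. The function names (`lsa_fit`, `lsa_project`, `lssvm_fit`, `lssvm_predict`), the toy vocabulary, and the values of `k` and `gamma` are all hypothetical choices made for illustration, and the web page expression and summarization steps are not modeled.

```python
import numpy as np

def lsa_fit(term_doc, k):
    """Truncated SVD of a (terms x documents) matrix; returns U_k (terms x k)."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    return U[:, :k]

def lsa_project(U_k, term_doc):
    """Project documents (columns) into the k-dim latent space; one row per document."""
    return (U_k.T @ term_doc).T

def lssvm_fit(X, y, gamma=10.0):
    """Solve the LS-SVM classifier system [[0, y^T], [y, Omega + I/gamma]] [b, alpha] = [0, 1]."""
    n = X.shape[0]
    K = X @ X.T                      # linear kernel (an assumption of this sketch)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = np.outer(y, y) * K + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]           # bias b, multipliers alpha

def lssvm_predict(X_train, y_train, b, alpha, X_new):
    """Sign of sum_i alpha_i * y_i * K(x, x_i) + b."""
    scores = (X_new @ X_train.T) @ (alpha * y_train) + b
    return np.where(scores >= 0, 1, -1)

# Toy term-document matrix (hypothetical data).
# Rows: car, automobile, engine, fruit, apple, banana. Columns: six documents.
term_doc = np.array([
    [1, 0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1, 1],
], dtype=float)
y = np.array([1, 1, 1, -1, -1, -1], dtype=float)   # +1 = vehicles, -1 = food

U_k = lsa_fit(term_doc, k=2)
latent_train = lsa_project(U_k, term_doc)
b, alpha = lssvm_fit(latent_train, y, gamma=10.0)

# Two unseen pages: one containing only "automobile", one containing only "banana".
new_docs = np.zeros((6, 2))
new_docs[1, 0] = 1.0
new_docs[5, 1] = 1.0
pred = lssvm_predict(latent_train, y, b, alpha, lsa_project(U_k, new_docs))
print(pred)
```

In this toy setup the page containing only "automobile" shares no term with the first training document (which contains "car" and "engine"), yet the two have a nonzero inner product after projection, which is the synonymy effect the abstract attributes to LSA. Because LS-SVM training reduces to one linear solve, the sketch needs no quadratic programming solver.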
Index Terms:
web page classification, least square support vector machine, latent semantic analysis, web page expression, noise reduction
Citation:
Yong Zhang, Bin Fan, Long-bin Xiao, "Web Page Classification Based on a Least Square Support Vector Machine with Latent Semantic Analysis," 2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), vol. 2, pp. 528-532, 2008