International Workshop on Challenges in Web Information Retrieval and Integration
News Item Extraction for Text Mining in Web Newspapers
Tokyo, Japan
April 8-9, 2005
ISBN: 0-7695-2414-1
Kjetil Norvag, Department of Computer and Information Science, Norwegian University of Science and Technology, Trondheim, Norway
Randi Oyri, Department of Computer and Information Science, Norwegian University of Science and Technology, Trondheim, Norway

Web newspapers are a valuable source of information, and text mining techniques can be applied to benefit more from it. However, because a single newspaper page often covers many unrelated topics, page-based data mining will not always give useful results. To improve on complete-page mining, we present an approach based on extracting the individual news items from the web pages and mining them separately. Automatic news item extraction is a difficult problem, and in this paper we also provide strategies for solving that task. We study the quality of the news item extraction and also provide results from clustering the extracted news items.

Citation:
Kjetil Norvag, Randi Oyri, "News Item Extraction for Text Mining in Web Newspapers," wiri, pp.195-204, International Workshop on Challenges in Web Information Retrieval and Integration, 2005