International Conference on Information Technology: Computers and Communications
Phrase-based Text Representation for Managing the Web Documents
Las Vegas, Nevada
April 28-April 30
ISBN: 0-7695-1916-4
Rupali Sharma, Indian Institute of Technology Madras
S. Raman, Indian Institute of Technology Madras
The World Wide Web has brought information to the fingertips of its users. Since most of the documents available on the web are machine-readable but not machine-understandable, ensuring the retrieval of relevant information continues to be a difficult task. In the traditional text representation approach, high-frequency keywords are used as terms representative of the text. However, the main drawbacks of this approach are the lack of a direct relationship between a word's frequency and its importance, and the effect of word ambiguities. Considering these shortcomings of the keyword-based method, this paper presents a phrase-based text representation approach that uses rule-based Natural Language Processing (NLP) techniques. Extraction of key-phrases from text documents is based on a process of partial parsing. By making the indexing terms more meaningful, through reduction of the ambiguity of words considered in isolation, the approach seeks to improve retrieval effectiveness.
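As an illustration only, the idea of extracting key-phrases by partial parsing can be sketched as a small rule-based chunker. This is not the authors' method: the toy lexicon, the coarse tag set, the chunk rule (a run of adjectives and nouns that ends in a noun), and the names LEXICON, tag, and extract_phrases are all assumptions made for this example.

```python
import re

# Hypothetical toy lexicon mapping words to coarse part-of-speech tags.
# A real system would use a tagger; this table is an assumption for the sketch.
LEXICON = {
    "the": "DET", "a": "DET",
    "semantic": "ADJ", "natural": "ADJ",
    "web": "NOUN", "language": "NOUN", "processing": "NOUN",
    "documents": "NOUN", "retrieval": "NOUN", "text": "NOUN",
    "improves": "VERB", "supports": "VERB",
    "of": "PREP", "for": "PREP",
}


def tag(text):
    """Lowercase, tokenize on letters, and look up each token's tag."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [(tok, LEXICON.get(tok, "OTHER")) for tok in tokens]


def extract_phrases(text, min_len=2):
    """Return multi-word chunks made of adjectives/nouns that end in a noun."""
    phrases, current = [], []
    # A sentinel tag flushes the last open chunk at the end of the text.
    for word, pos in tag(text) + [("", "END")]:
        if pos in ("ADJ", "NOUN"):
            current.append((word, pos))
            continue
        # The chunk is closed: drop trailing adjectives so it ends in a noun.
        while current and current[-1][1] != "NOUN":
            current.pop()
        if len(current) >= min_len:
            phrases.append(" ".join(w for w, _ in current))
        current = []
    return phrases


sentence = "Natural language processing supports text retrieval for the semantic web"
print(extract_phrases(sentence))
```

Indexing "semantic web" as one term, rather than "semantic" and "web" separately, is the kind of ambiguity reduction the abstract describes; the parse is partial because only the phrase chunks are identified, not the full sentence structure.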
Index Terms:
phrase-based text representation, text retrieval, semantic web, name recognition
Citation:
Rupali Sharma, S. Raman, "Phrase-based Text Representation for Managing the Web Documents," itcc, pp.165, International Conference on Information Technology: Computers and Communications, 2003