International Workshop on Challenges in Web Information Retrieval and Integration
Postal Address Detection from Web Documents
Tokyo, Japan
April 08-April 09
ISBN: 0-7695-2414-1
Lin Can, School of Information, Renmin University Beijing, PRC
Zhang Qian, School of Information, Renmin University Beijing, PRC
Meng Xiaofeng, School of Information, Renmin University Beijing, PRC
Liu Wenyin, Department of Computer Science, City University of Hong Kong Tat Chee Avenue, Hong Kong SAR, PRC

An approach to postal address detection from webpages is proposed. The webpages are first segmented into text blocks based on their visual similarity. The text in each block is then passed to a recognition process that uses a syntactic approach. Grammars covering almost all possible patterns of postal addresses are built for this purpose. Preliminary experiments on 44 webpages containing 56 true addresses show that the approach detects postal addresses with high precision (89.3%) and a low false alarm rate (3.8%).
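
The recognition step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the page has already been segmented into text blocks (the visual segmentation is not reproduced), and it stands in a single regular expression for one US-style address pattern where the paper uses grammars for many patterns. The names `ADDRESS`, `STREET_SUFFIX`, and `detect_addresses` are hypothetical.

```python
import re

# Hypothetical, deliberately tiny pattern for ONE US-style address form.
# The paper's actual grammars are not reproduced here.
STREET_SUFFIX = r"(?:Street|St|Avenue|Ave|Road|Rd|Boulevard|Blvd|Drive|Dr|Lane|Ln)"
ADDRESS = re.compile(
    r"\b(?P<number>\d{1,5})\s+"                                   # house number
    r"(?P<street>(?:[A-Z][a-z]+\s+){1,3}" + STREET_SUFFIX + r")"  # street name + suffix
    r"\.?,?\s+"
    r"(?P<city>(?:[A-Z][a-z]+\s?){1,3}),\s*"                      # city
    r"(?P<state>[A-Z]{2})\s+"                                     # two-letter state code
    r"(?P<zip>\d{5}(?:-\d{4})?)\b"                                # ZIP or ZIP+4
)


def detect_addresses(blocks):
    """Return (block_index, matched_text) pairs for every address-like match.

    `blocks` is assumed to be a list of strings, one per text block
    produced by an earlier segmentation step.
    """
    results = []
    for index, block in enumerate(blocks):
        # Collapse runs of whitespace so line breaks inside a block do not matter.
        text = " ".join(block.split())
        for match in ADDRESS.finditer(text):
            results.append((index, match.group(0)))
    return results


if __name__ == "__main__":
    blocks = [
        "Home | About | Contact",
        "Visit us at 123 Main Street, Springfield, IL 62704 today.",
        "Copyright 2005",
    ]
    for index, address in detect_addresses(blocks):
        print(index, address)
```

Matching block by block, rather than over the whole page text, keeps an address from being assembled out of fragments that belong to unrelated parts of the page. That is one plausible reason for segmenting first.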

Citation:
Lin Can, Zhang Qian, Meng Xiaofeng, Liu Wenyin, "Postal Address Detection from Web Documents," wiri, pp.40-45, International Workshop on Challenges in Web Information Retrieval and Integration, 2005