Seventh International Conference on Document Analysis and Recognition (ICDAR'03) - Volume 1
Document Image Retrieval Based on 2D Density Distributions of Terms with Pseudo Relevance Feedback
Edinburgh, Scotland
August 03-August 06
ISBN: 0-7695-1960-1
Koichi Kise, Osaka Prefecture University
Yin Wuotang, Osaka Prefecture University
Keinosuke Matsumoto, Osaka Prefecture University
Document image retrieval is the task of retrieving document images relevant to a user's query. Most existing methods based on word-level indexing rely on a representation called "bag of words", which originated in the field of information retrieval. This paper presents a new representation of documents that utilizes additional information about the location of words in pages so as to improve the retrieval performance. We consider that pages are relevant to a query if they contain its terms densely. This notion is embodied as density distributions of terms calculated in the proposed method. Its performance is improved with the help of "pseudo relevance feedback", i.e., a method of expanding a query by analyzing pages. Experimental results on English document images show that the proposed method is superior to conventional methods of electronic document retrieval at recall levels 0.0-0.6.
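The idea that a page is relevant when query terms occur densely on it can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state the window shape, the grid resolution, or the scoring rule, so the box window, the `radius` parameter, the cell-based coordinates, and the peak-density score below are all assumptions chosen for illustration, and the function names `density_map` and `page_score` are hypothetical.

```python
def density_map(positions, width, height, radius=1):
    """Smooth term occurrences on a width x height grid with a box window.

    positions: list of (x, y) grid cells where one term occurs on the page.
    The box window is an assumption; the abstract does not name a window.
    """
    counts = [[0.0] * width for _ in range(height)]
    for x, y in positions:
        counts[y][x] += 1.0

    density = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            # Sum the occurrences inside the (2*radius+1)^2 neighbourhood.
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        total += counts[ny][nx]
            density[y][x] = total
    return density


def page_score(query_hits, width, height, radius=1):
    """Score a page by the peak of the summed per-term densities.

    query_hits: dict mapping each query term to its list of (x, y) cells.
    Using the peak value as the page score is an illustrative assumption.
    """
    combined = [[0.0] * width for _ in range(height)]
    for positions in query_hits.values():
        term_density = density_map(positions, width, height, radius)
        for y in range(height):
            for x in range(width):
                combined[y][x] += term_density[y][x]
    return max(max(row) for row in combined)


# Two toy pages on an 8 x 8 grid, each with the same number of term hits.
clustered = {"image": [(1, 1), (2, 1)], "retrieval": [(1, 2)]}
scattered = {"image": [(0, 0), (7, 7)], "retrieval": [(4, 0)]}

clustered_score = page_score(clustered, 8, 8)
scattered_score = page_score(scattered, 8, 8)
```

Both toy pages contain the same three term occurrences, so a bag-of-words count would not separate them, whereas the density-based score ranks the page with clustered terms higher.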
Citation:
Koichi Kise, Yin Wuotang, Keinosuke Matsumoto, "Document Image Retrieval Based on 2D Density Distributions of Terms with Pseudo Relevance Feedback," icdar, vol. 1, pp.488, Seventh International Conference on Document Analysis and Recognition (ICDAR'03) - Volume 1, 2003