Fourth International Conference on Document Analysis and Recognition (ICDAR'97)
Measuring the Effects of OCR Errors on Similarity Linking
Ulm, GERMANY
August 18-20, 1997
ISBN: 0-8186-7898-4
Andreas Myka, Wilhelm-Schickard-Institut Universitaet Tuebingen
Ulrich Guentzer, Wilhelm-Schickard-Institut Universitaet Tuebingen
The vector-space model offers a simple and robust model for Information Retrieval. In this model, both the similarities between queries and documents and the similarities between documents themselves are important. Document similarities may be used to generate links that lead users from one document to related ones. Studies have shown that the vector-space model is robust in the context of OCR processing if manually constructed queries are used. However, it is not clear whether this model, when used for hypertext construction, is robust to the data corruption caused by OCR engines. In this paper, we describe the performance of automatic hypertext construction, based on the vector-space model, with regard to three different measures: the number of overtakings within the rankings used, the accumulated distance of a document's position within the rankings, and a comparison based on recall-precision graphs.
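The abstract does not define its measures, so the following is only an illustrative sketch of how similarity linking and the first two ranking comparisons might look. It assumes raw term-frequency vectors with cosine similarity, reads "overtakings" as pairs of documents whose relative order differs between a reference ranking (clean text) and an observed ranking (OCR text), and reads "accumulated distance" as the sum of absolute position shifts. All function names are hypothetical, and the paper's actual definitions may differ.

```python
import math
from collections import Counter


def tf_vector(text):
    """Raw term-frequency vector (assumption: whitespace tokens, no weighting)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_links(source_id, docs):
    """Rank all other documents by similarity to the source document."""
    src = tf_vector(docs[source_id])
    scored = [
        (cosine(src, tf_vector(text)), doc_id)
        for doc_id, text in docs.items()
        if doc_id != source_id
    ]
    # Highest similarity first; ties broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


def count_overtakings(reference, observed):
    """Assumed reading: number of document pairs ordered differently."""
    pos = {doc_id: i for i, doc_id in enumerate(observed)}
    swaps = 0
    for i in range(len(reference)):
        for j in range(i + 1, len(reference)):
            if pos[reference[i]] > pos[reference[j]]:
                swaps += 1
    return swaps


def accumulated_distance(reference, observed):
    """Assumed reading: sum of absolute rank shifts over all documents."""
    pos = {doc_id: i for i, doc_id in enumerate(observed)}
    return sum(abs(i - pos[doc_id]) for i, doc_id in enumerate(reference))
```

In use, one would call `rank_links` once on the clean collection and once on its OCR-processed counterpart, then pass the two rankings to `count_overtakings` and `accumulated_distance`. Both rankings must contain the same document ids.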
Citation:
Andreas Myka, Ulrich Guentzer, "Measuring the Effects of OCR Errors on Similarity Linking," icdar, pp.968, Fourth International Conference Document Analysis and Recognition (ICDAR'97), 1997