Seventh International Conference on Document Analysis and Recognition (ICDAR'03) - Volume 1
Text Identification in Noisy Document Images Using Markov Random Field
Edinburgh, Scotland
August 03-August 06
ISBN: 0-7695-1960-1
Yefeng Zheng, University of Maryland
Huiping Li, University of Maryland
David Doermann, University of Maryland
In this paper we address the problem of identifying text in noisy documents. We distinguish handwriting from machine-printed text because 1) handwriting in a document often indicates corrections, additions, or other supplemental information that should be treated differently from the main body content, and 2) the segmentation and recognition techniques for machine-printed text and handwriting differ significantly. Our novelty is that we treat noise as a separate class and model it based on selected features. Trained Fisher classifiers are used to distinguish machine-printed text and handwriting from noise. We further exploit context to refine the classification: a Markov Random Field (MRF) based approach models the geometrical structure of the printed text, handwriting, and noise to correct misclassifications. Experimental results show that our approach is promising and robust, and can significantly improve page segmentation results on noisy documents.
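The two stages named in the abstract, a Fisher classifier followed by context-based refinement, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the features, the MRF neighborhood structure, or the optimization method, so everything below is an assumption made for illustration. `fisher_direction` is a textbook two-class Fisher discriminant on 2-D feature vectors, and `icm_refine` uses iterated conditional modes with a simple "neighbors should agree" penalty as a stand-in for MRF-based relabeling. The names `unary`, `neighbors`, and `beta` are hypothetical.

```python
def fisher_direction(class_a, class_b):
    """Two-class Fisher discriminant for 2-D points (illustrative only)."""
    def mean(pts):
        n = len(pts)
        return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)

    ma, mb = mean(class_a), mean(class_b)
    # Within-class scatter S_w: sum of (x - m)(x - m)^T over both classes.
    sxx = sxy = syy = 0.0
    for pts, m in ((class_a, ma), (class_b, mb)):
        for x, y in pts:
            dx, dy = x - m[0], y - m[1]
            sxx += dx * dx
            sxy += dx * dy
            syy += dy * dy
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("singular within-class scatter matrix")
    # Projection direction w = S_w^{-1} (m_a - m_b).
    d0, d1 = ma[0] - mb[0], ma[1] - mb[1]
    w = ((syy * d0 - sxy * d1) / det, (-sxy * d0 + sxx * d1) / det)
    # Threshold: midpoint of the two projected class means (an assumption).
    t = 0.5 * (w[0] * (ma[0] + mb[0]) + w[1] * (ma[1] + mb[1]))
    return w, t


def fisher_score(w, t, p):
    """Positive score means p falls on the class_a side of the threshold."""
    return w[0] * p[0] + w[1] * p[1] - t


def icm_refine(unary, neighbors, beta=1.0, sweeps=5):
    """Relabel blocks with iterated conditional modes (hypothetical setup).

    unary[i][k]  : cost of giving block i the label k (e.g. from a classifier)
    neighbors[i] : indices of blocks adjacent to block i
    beta         : penalty for each neighbor that carries a different label
    """
    # Start from the label with the lowest classifier cost for each block.
    labels = [min(range(len(u)), key=u.__getitem__) for u in unary]
    for _ in range(sweeps):
        changed = False
        for i, u in enumerate(unary):
            def energy(k):
                disagree = sum(1 for j in neighbors[i] if labels[j] != k)
                return u[k] + beta * disagree
            best = min(range(len(u)), key=energy)
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels
```

As a usage example under the same assumptions, take five blocks on one text line with labels 0 = printed, 1 = handwriting, 2 = noise. If the middle block is weakly classified as noise while its neighbors are confidently printed, the agreement penalty pulls it back to printed text, which is the kind of correction the abstract attributes to context.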
Citation:
Yefeng Zheng, Huiping Li, David Doermann, "Text Identification in Noisy Document Images Using Markov Random Field," icdar, vol. 1, pp.599, Seventh International Conference on Document Analysis and Recognition (ICDAR'03) - Volume 1, 2003