Seventh International Conference on Document Analysis and Recognition (ICDAR'03) - Volume 2
Automatic Feature Selection with Applications to Script Identification of Degraded Documents
Edinburgh, Scotland
August 03-August 06
ISBN: 0-7695-1960-1
Vitaly Ablavsky, Charles River Analytics Inc.
Mark R. Stevens, Charles River Analytics Inc.
Current approaches to script identification rely on hand-selected features and often require processing a significant part of the document to achieve reliable identification. We present an approach that applies a large pool of image features to a small training sample and uses feature subset selection techniques to automatically choose the subset with the most discriminating power. At run time, we use a classifier coupled with an evidence accumulation engine to report a script label once a preset likelihood threshold has been reached. We apply the system to a diverse corpus of printed Russian and English documents that suffer from common degradation problems. Our validation study shows promising results in terms of both script identification accuracy and the ability to identify script on the scale of individual words and text lines.
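The abstract does not say how the evidence accumulation engine is built. As a rough illustration only, the sketch below assumes a per-word classifier that outputs P(Russian | word), equal class priors, and independence between words, and fuses those outputs as summed log-odds until a preset threshold is crossed. The function name, the labels, and the 0.95 default are hypothetical and are not taken from the paper.

```python
import math


def accumulate_evidence(word_posteriors, threshold=0.95):
    """Fuse per-word P(russian | word) scores until one script is likely enough.

    Illustrative assumptions (not from the paper): equal class priors and
    independent words, so the per-word log-odds can simply be summed.
    Returns (label, words_used); label is None if the threshold is never met.
    """
    limit = math.log(threshold / (1.0 - threshold))  # threshold in log-odds
    log_odds = 0.0  # equal priors: start with no preference
    used = 0
    for p in word_posteriors:
        used += 1
        p = min(max(p, 1e-6), 1.0 - 1e-6)  # keep the logarithm finite
        log_odds += math.log(p / (1.0 - p))
        if log_odds >= limit:
            return "russian", used
        if log_odds <= -limit:
            return "english", used
    return None, used
```

Under these assumptions, a few moderately confident words are enough to commit to a label, which is in the spirit of the word-level and line-level identification the abstract reports, while conflicting words leave the decision open.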
Citation:
Vitaly Ablavsky and Mark R. Stevens, "Automatic Feature Selection with Applications to Script Identification of Degraded Documents," Seventh International Conference on Document Analysis and Recognition (ICDAR'03), vol. 2, p. 750, 2003.