Seventh International Conference on Document Analysis and Recognition (ICDAR'03) - Volume 2
Multi-Script Line identification from Indian Documents
Edinburgh, Scotland
August 03-August 06
ISBN: 0-7695-1960-1
U. Pal, Indian Statistical Institute
S. Sinha, Indian Statistical Institute
B. B. Chaudhuri, Indian Statistical Institute
A document page may contain two or more different scripts. For Optical Character Recognition (OCR) of such a page, the scripts must be separated before they are fed to their individual OCR systems. This paper presents an automatic scheme to identify text lines of different Indian scripts in a document. For the separation task, the scripts are first grouped into a few classes according to script characteristics. Next, features based on the water reservoir principle, contour tracing, profiles, etc. are employed to identify them without any expensive OCR-like algorithms. At present, the system has an overall accuracy of about 97.52%.
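As a rough illustration of what a profile feature can look like, the following minimal Python sketch computes a horizontal projection profile for a binarized text line and reports its densest row. It is not the authors' implementation: the function names, the list-of-lists image representation, and the "peak row ratio" feature are assumptions chosen for illustration only.

```python
from typing import List, Tuple

def horizontal_profile(image: List[List[int]]) -> List[int]:
    """Count foreground pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in image]

def peak_row_ratio(image: List[List[int]]) -> Tuple[int, float]:
    """Return the index of the densest row and the fraction of the
    image width covered by foreground pixels in that row.

    Hypothetical feature: a ratio near 1.0 suggests a long horizontal
    stroke running across the text line.
    """
    profile = horizontal_profile(image)
    if not profile or not image[0]:
        return -1, 0.0
    # Index of the row with the most foreground pixels (first one on ties).
    peak = max(range(len(profile)), key=profile.__getitem__)
    return peak, profile[peak] / len(image[0])

# Toy binarized text line: row 1 is a full-width horizontal stroke.
line = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 1, 0],
]
```

Calling `peak_row_ratio(line)` on the toy input identifies row 1 as the densest row. A real system would combine several such features and would first need binarization and line segmentation, neither of which is shown here.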
Citation:
U. Pal, S. Sinha, B. B. Chaudhuri, "Multi-Script Line identification from Indian Documents," Seventh International Conference on Document Analysis and Recognition (ICDAR'03) - Volume 2, pp. 880, 2003