Ninth International Conference on Document Analysis and Recognition (ICDAR 2007) Vol 1
Annotated Databases for the Recognition of Screen-Rendered Text
Curitiba, Parana, Brazil
September 23-September 26
ISBN: 0-7695-2822-8
S. Wachenfeld, University of Münster, Germany
H.-U. Klein, University of Münster, Germany
X. Jiang, University of Münster, Germany
The recognition of screen-rendered text is a novel task. It is performed, e.g., by translation tools that let users click on any text on the screen and receive a translation. Some commercial OCR programs have also started to address the problem of reading screenshots. Optical character recognition on screenshot images can be very challenging because the fonts are very small and smoothed. To build and compare recognition approaches for screen-rendered text, the availability of standard databases is a fundamental prerequisite. In this paper, two freely available databases are presented: one consists of annotated screenshot images of 28 080 single characters, and the other holds 400 words extracted from documents plus 2 400 generated isolated words. Both databases include meta-information such as x-height, font type, style, and rendering conditions. Using a developed recognition system as an example, it is shown how these databases can serve for training, testing, and optimization.
Citation:
S. Wachenfeld, H.-U. Klein, X. Jiang, "Annotated Databases for the Recognition of Screen-Rendered Text," Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), vol. 1, pp. 272-276, 2007