2008 The Eighth IAPR International Workshop on Document Analysis Systems
End-to-End Trainable Thai OCR System Using Hidden Markov Models
September 16-September 19
ISBN: 978-0-7695-3337-7
In this paper we present an end-to-end trainable Optical Character Recognition (OCR) system for recognizing machine-printed text in Thai documents. The system is based on a script-independent methodology using hidden Markov models. It provides an integrated workflow that runs from annotation and transcription of training images through to performing OCR on new images with models trained on the transcribed training images. We demonstrate the efficacy of the end-to-end system by rapidly configuring our OCR engine for the Thai script. We present experimental results on Thai documents to highlight the specific challenges posed by the Thai script, and we analyze recognition performance as a function of the amount of training data.
Index Terms:
OCR, Thai, HMM, script-independent, integrated workflow, end-to-end, annotation, transcription, training, document images, Thai script
Citation:
Kriste Krstovski, Ehry MacRostie, Rohit Prasad, Premkumar Natarajan, "End-to-End Trainable Thai OCR System Using Hidden Markov Models," 2008 The Eighth IAPR International Workshop on Document Analysis Systems (DAS), pp. 607-614, 2008