15th International Conference on Pattern Recognition (ICPR'00) - Volume 4
Improved Degraded Document Recognition with Hybrid Modeling Techniques and Character N-Grams
Barcelona, Spain
September 3-8, 2000
ISBN: 0-7695-0750-6
Anja Brakensiek, Gerhard-Mercator-University Duisburg
Daniel Willett, Gerhard-Mercator-University Duisburg
Gerhard Rigoll, Gerhard-Mercator-University Duisburg
This paper describes a robust multi-font character recognition system for degraded documents such as photocopies or faxes. The system is based on Hidden Markov Models (HMMs) using discrete and hybrid modeling techniques, where the latter make use of an information-theory-based neural network. The recognition results presented refer to the SEDAL database of English documents and were obtained without a dictionary. It is also demonstrated that using a language model consisting of character n-grams yields significantly better recognition results. The resulting system clearly outperforms commercial systems and achieves further error rate reductions compared to previous results on this database.
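To illustrate how a character n-gram language model can help without a dictionary, the following is a minimal sketch, not the authors' implementation. It assumes a bigram order and add-one smoothing, neither of which is stated in the abstract, and the names `train_char_bigrams` and `log_prob` are hypothetical. In a full recognizer such a score would presumably be combined with the HMM scores during decoding; here it only ranks two candidate strings.

```python
import math
from collections import Counter


def train_char_bigrams(text):
    """Count character contexts and bigrams; '^' marks the start of the text."""
    padded = "^" + text
    contexts = Counter(padded[:-1])                  # how often each character precedes another
    bigrams = Counter(zip(padded[:-1], padded[1:]))  # counts of adjacent character pairs
    vocab = set(padded)
    return contexts, bigrams, vocab


def log_prob(candidate, model):
    """Log-probability of a string under the bigram model (add-one smoothing, an assumption)."""
    contexts, bigrams, vocab = model
    padded = "^" + candidate
    total = 0.0
    for prev, cur in zip(padded[:-1], padded[1:]):
        numerator = bigrams[(prev, cur)] + 1
        denominator = contexts[prev] + len(vocab)
        total += math.log(numerator / denominator)
    return total


# Toy training text (illustrative only, not the SEDAL data).
model = train_char_bigrams("the quick brown fox jumps over the lazy dog the end")

# Two hypotheses a recognizer might produce for a degraded word image.
candidates = ["the", "tbe"]
best = max(candidates, key=lambda c: log_prob(c, model))
```

With this toy training text, the model prefers "the" over "tbe" because the pairs "th" and "he" were observed while "tb" and "be" were not, which is the kind of evidence that can correct confusions between visually similar characters.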
Citation:
Anja Brakensiek, Daniel Willett, Gerhard Rigoll, "Improved Degraded Document Recognition with Hybrid Modeling Techniques and Character N-Grams," icpr, vol. 4, pp. 4438, 15th International Conference on Pattern Recognition (ICPR'00) - Volume 4, 2000