Ninth International Conference on Document Analysis and Recognition (ICDAR 2007) Vol 1
Fast Selection of Small and Precise Candidate Sets from Dictionaries for Text Correction Tasks
Curitiba, Parana, Brazil
September 23-September 26
ISBN: 0-7695-2822-8
K. Schulz, CIS, University of Munich
S. Mihov, IPP, Bulgarian Academy of Sciences
P. Mitankin, IPP, Bulgarian Academy of Sciences
Lexical text correction relies on a central step where approximate search in a dictionary is used to select the best correction suggestions for an ill-formed input token. In previous work we introduced the concept of a universal Levenshtein automaton and showed how to use these automata for efficiently selecting from a dictionary all entries within a fixed Levenshtein distance to the garbled input word. In this paper we look at refinements of the basic Levenshtein distance that yield more sensible notions of similarity in distinct text correction applications, e.g. OCR. We show that the concept of a universal Levenshtein automaton can be adapted to these refinements. In this way we obtain a method for selecting correction candidates which is very efficient, at the same time selecting small candidate sets with high recall.
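To make the candidate-selection step concrete, the following is a minimal Python sketch of selecting all dictionary entries within a fixed Levenshtein distance of a garbled token. It is a brute-force baseline using the textbook dynamic-programming distance, not the universal Levenshtein automaton described in the paper, and it does not model the refined distances for OCR. The function names `levenshtein` and `candidates` and the sample word list are assumptions made for illustration.

```python
from typing import Iterable, List


def levenshtein(a: str, b: str) -> int:
    """Textbook edit distance: unit-cost insertion, deletion, substitution."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def candidates(token: str, dictionary: Iterable[str], max_dist: int) -> List[str]:
    """Return all dictionary entries within max_dist edits of token (sorted).

    Hypothetical helper: scans the whole dictionary, so it is far slower
    than an automaton-based search on large word lists.
    """
    return sorted(
        w for w in dictionary
        # A length gap larger than max_dist already rules the entry out.
        if abs(len(w) - len(token)) <= max_dist
        and levenshtein(token, w) <= max_dist
    )


if __name__ == "__main__":
    words = ["correction", "collection", "connection", "selection"]
    print(candidates("corection", words, 1))
    print(candidates("corection", words, 2))
```

Raising the distance bound enlarges the candidate set, which illustrates the trade-off the abstract points to: a plain Levenshtein bound that is loose enough for high recall can also admit many implausible candidates.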
Citation:
K. Schulz, S. Mihov, P. Mitankin, "Fast Selection of Small and Precise Candidate Sets from Dictionaries for Text Correction Tasks," Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), vol. 1, pp. 471-475, 2007