Eighth International Conference on Document Analysis and Recognition (ICDAR'05)
Fast Script Word Recognition with Very Large Vocabulary
Seoul, Korea
August 31-September 01
ISBN: 0-7695-2420-6
An algorithm for fast processing of large lexica in an HMM-based script word recognition system is presented. It consists of two steps: first, a lexicon-free recognition is performed; second, a tree search is run on the intermediate results of the first step, the trellis of probabilities. Thus, the computational effort for the recognition itself is reduced in the first step, while recognition accuracy is preserved by using detailed information in the second step. A speedup factor of up to 15 was obtained compared to traditional tree recognition, making script word recognition with large lexica available to time-critical tasks such as postal automation. There, lexica containing, e.g., all city or street names (20k to 500k entries) have to be processed within a few milliseconds.
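The second step can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes, for illustration only, that the trellis from the lexicon-free pass is a frames-by-characters table of log probabilities, whereas a real HMM system would keep state-level scores. It also assumes that every lexicon word uses only characters from the alphabet. The names `build_trie` and `tree_search` are hypothetical. Words that share a prefix share one dynamic-programming column in the tree, and the search only looks up trellis values instead of re-evaluating the models.

```python
import math

NEG_INF = float("-inf")

def build_trie(words):
    """Build a prefix tree; nodes that end a word store it under 'word'."""
    root = {"children": {}, "word": None}
    for w in words:
        node = root
        for ch in w:
            node = node["children"].setdefault(ch, {"children": {}, "word": None})
        node["word"] = w
    return root

def tree_search(trellis, alphabet, lexicon):
    """Return (best_word, log_score) for a precomputed trellis.

    trellis[t][c] is an assumed per-frame log probability of character c
    at frame t, produced once by the lexicon-free first step.
    Each character of a word must cover at least one frame.
    """
    T = len(trellis)
    col = {ch: i for i, ch in enumerate(alphabet)}
    best_word, best_score = None, NEG_INF
    # entry[t]: best score of the path so far after consuming frames 0..t-1
    root_entry = [0.0] + [NEG_INF] * T
    stack = [(build_trie(lexicon), root_entry)]
    while stack:
        node, entry = stack.pop()
        for ch, child in node["children"].items():
            c = col[ch]
            d = [NEG_INF] * (T + 1)
            for t in range(T):
                # either enter this character from the parent or stay in it
                prev = max(entry[t], d[t])
                if prev > NEG_INF:
                    d[t + 1] = prev + trellis[t][c]
            if child["word"] is not None and d[T] > best_score:
                best_word, best_score = child["word"], d[T]
            stack.append((child, d))
    return best_word, best_score

# Toy trellis: four frames over the alphabet "abc" (made-up values).
probs = [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.7, 0.1]]
trellis = [[math.log(p) for p in row] for row in probs]
word, score = tree_search(trellis, "abc", ["ab", "abc", "ca"])
print(word)  # prints "ab"
```

In this sketch the column for the prefix "ab" is computed once and reused for "abc". Prefix sharing of this kind is what makes a tree search attractive when the lexicon holds many names with common beginnings.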