Eighth International Conference on Document Analysis and Recognition (ICDAR'05)
Determining Optimal Filters for Binarization of Degraded Grayscale Characters Using Genetic Algorithms
Seoul, Korea
August 31-September 01
ISBN: 0-7695-2420-6
Optimal binarization of degraded grayscale characters is a crucial step for subsequent character recognition. This paper proposes a new, promising binarization technique for grayscale characters that uses genetic algorithms (GA) to search for an optimal sequence of filters from among a set of rather simple, representative image processing filters. First, we classify degraded samples of grayscale characters into several categories. Then, in the learning stage, we select a training sample from each degradation category and apply GA to the combinatorial optimization problem of determining a sequence of filters that maximizes the fitness value between the filtered training sample and its target image, ideally binarized by humans. Finally, in the testing stage, we apply the optimal sequence of filters thus obtained to the remaining test samples of each degradation category. Experiments using the public ICDAR 2003 robust OCR dataset demonstrate promising binarization results for grayscale characters across a wide variety of degradation causes.
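The learning stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the filters, the fitness function, or the GA settings, so everything below is an assumption made for illustration. The filter set (`identity`, `mean3`, `median3`, `invert`, `stretch`, `threshold`), the pixel-agreement fitness, the fixed sequence length, and the tournament selection, one-point crossover, and elitism scheme are all hypothetical choices. Images are represented as plain lists of rows of integers in 0..255.

```python
import random

def _map_window(img, fn):
    # Apply fn to the 3x3 neighbourhood (clipped at borders) of every pixel.
    h, w = len(img), len(img[0])
    def window(y, x):
        return [img[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))]
    return [[fn(window(y, x)) for x in range(w)] for y in range(h)]

def identity(img):
    return [row[:] for row in img]

def mean3(img):
    return _map_window(img, lambda n: sum(n) // len(n))

def median3(img):
    return _map_window(img, lambda n: sorted(n)[len(n) // 2])

def invert(img):
    return [[255 - p for p in row] for row in img]

def stretch(img):
    # Linear contrast stretch to the full 0..255 range.
    lo = min(min(r) for r in img)
    hi = max(max(r) for r in img)
    if hi == lo:
        return identity(img)
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in img]

def threshold(img, t=128):
    return [[255 if p >= t else 0 for p in row] for row in img]

# Hypothetical filter set; a chromosome is a list of indices into it.
FILTERS = [identity, mean3, median3, invert, stretch, threshold]

def apply_sequence(seq, img):
    for idx in seq:
        img = FILTERS[idx](img)
    return threshold(img)  # force a binary result (an assumption of this sketch)

def fitness(seq, sample, target):
    # Assumed fitness: fraction of pixels that agree with the hand-made target.
    out = apply_sequence(seq, sample)
    total = len(target) * len(target[0])
    match = sum(o == t for ro, rt in zip(out, target) for o, t in zip(ro, rt))
    return match / total

def evolve(sample, target, seq_len=4, pop_size=20, generations=30,
           p_mut=0.1, seed=0):
    # seq_len must be at least 2 so that a crossover point exists.
    rng = random.Random(seed)
    pop = [[rng.randrange(len(FILTERS)) for _ in range(seq_len)]
           for _ in range(pop_size)]
    score = lambda s: fitness(s, sample, target)
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        nxt = [pop[0][:]]  # elitism: carry the best sequence over unchanged
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 3), key=score)  # tournament selection
            b = max(rng.sample(pop, 3), key=score)
            cut = rng.randrange(1, seq_len)         # one-point crossover
            child = a[:cut] + b[cut:]
            child = [rng.randrange(len(FILTERS)) if rng.random() < p_mut else g
                     for g in child]                # per-gene mutation
            nxt.append(child)
        pop = nxt
    best = max(pop, key=score)
    return best, score(best)
```

In the scheme the abstract describes, a search like `evolve` would be run once per degradation category on that category's training sample, and the resulting sequence would then be passed to something like `apply_sequence` for each remaining test sample in the same category.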
Citation:
Yusuke Ojima, Satoshi Kirigaya, Toru Wakahara, "Determining Optimal Filters for Binarization of Degraded Grayscale Characters Using Genetic Algorithms," Eighth International Conference on Document Analysis and Recognition (ICDAR'05), pp. 555-559, 2005