Eighth International Conference on Document Analysis and Recognition (ICDAR'05)
A Comparison of Binarization Methods for Historical Archive Documents
Seoul, Korea
August 31-September 01
ISBN: 0-7695-2420-6
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDAR.2005.3
This paper compares several alternative binarization algorithms for historical archive documents, by evaluating their effect on end-to-end word recognition performance in a complete archive document recognition system utilising a commercial OCR engine. The algorithms evaluated are: global thresholding; Niblack's and Sauvola's algorithms; adaptive versions of Niblack's and Sauvola's algorithms; and Niblack's and Sauvola's algorithms applied to background removed images. We found that, for our archive documents, Niblack's algorithm can achieve better performance than Sauvola's (which has been claimed as an evolution of Niblack's algorithm), and that it also achieved better performance than the internal binarization provided as part of the commercial OCR engine.
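For readers unfamiliar with the two local thresholding rules named in the abstract, the following is a minimal sketch of their commonly cited forms: Niblack's threshold T = m + k*s and Sauvola's threshold T = m*(1 + k*(s/R - 1)), where m and s are the mean and standard deviation of a window around each pixel. The parameter values (k = -0.2, k = 0.5, R = 128), the window radius, the function names, and the dark-ink-on-light-paper convention are assumptions chosen for illustration; they are not the settings or the implementation used in the paper.

```python
import math


def local_stats(image, row, col, radius):
    """Mean and standard deviation of the window centred on (row, col).

    The window is clipped at the image borders.
    """
    rows = range(max(0, row - radius), min(len(image), row + radius + 1))
    cols = range(max(0, col - radius), min(len(image[0]), col + radius + 1))
    values = [image[r][c] for r in rows for c in cols]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


def niblack_threshold(mean, std, k=-0.2):
    """Niblack's rule: T = m + k * s (k = -0.2 is an assumed, typical value)."""
    return mean + k * std


def sauvola_threshold(mean, std, k=0.5, dynamic_range=128.0):
    """Sauvola's rule: T = m * (1 + k * (s / R - 1)).

    k = 0.5 and R = 128 are assumed, typical values for 8-bit images.
    """
    return mean * (1.0 + k * (std / dynamic_range - 1.0))


def binarize(image, threshold_fn, radius=1):
    """Return a 0/1 image: 1 marks foreground (pixel darker than its local threshold)."""
    result = []
    for r in range(len(image)):
        out_row = []
        for c in range(len(image[0])):
            mean, std = local_stats(image, r, c, radius)
            out_row.append(1 if image[r][c] < threshold_fn(mean, std) else 0)
        result.append(out_row)
    return result


# Hypothetical 5x5 grey-level page: light background (200) with a dark vertical stroke (40).
page = [[200, 200, 40, 200, 200] for _ in range(5)]
niblack_result = binarize(page, niblack_threshold)
sauvola_result = binarize(page, sauvola_threshold)
```

On this clean synthetic page both rules isolate the stroke. The two rules differ mainly in low-contrast regions: Niblack's threshold stays close to the local mean when the standard deviation is small, whereas Sauvola's threshold drops well below it.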
Citation:
J. He, Q. D. M. Do, A. C. Downton, J. H. Kim, "A Comparison of Binarization Methods for Historical Archive Documents," icdar, pp.538-542, Eighth International Conference on Document Analysis and Recognition (ICDAR'05), 2005