Digital Libraries, Joint Conference on (2003)
Houston, Texas, USA
May 27, 2003 to May 31, 2003
Michael Droettboom, Johns Hopkins University
This paper presents a new technique for dealing with broken characters, one of the major challenges in the optical character recognition (OCR) of degraded historical printed documents. A technique based on graph combinatorics is used to rejoin the appropriate connected components. It has been applied to real data with successful results.
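The abstract does not spell out the algorithm, so the following is only a minimal illustrative sketch of the general idea of rejoining connected components through a graph search, not the paper's actual method. It assumes a hypothetical adjacency graph in which each node is one connected component and edges link components that lie close together, a hypothetical `toy_score` function standing in for a real character classifier, and a simple objective that sums per-group scores. All function and variable names are invented for illustration.

```python
from functools import lru_cache


def connected_subsets(adjacency, max_size):
    """Enumerate every connected set of nodes containing at most max_size nodes."""
    found = set()
    frontier = {frozenset([node]) for node in adjacency}
    while frontier:
        found |= frontier
        grown = set()
        for subset in frontier:
            if len(subset) == max_size:
                continue  # do not grow candidates beyond the size limit
            for node in subset:
                for neighbour in adjacency[node]:
                    if neighbour not in subset:
                        grown.add(subset | {neighbour})
        frontier = grown - found
    return found


def best_grouping(adjacency, score, max_size=3):
    """Partition the nodes into connected groups maximising the summed score.

    Assumption: `score(group)` returns a number saying how much the merged
    group resembles a single character (hypothetical classifier output).
    """
    candidates = connected_subsets(adjacency, max_size)

    @lru_cache(maxsize=None)
    def solve(remaining):
        if not remaining:
            return 0.0, ()
        anchor = min(remaining)  # always decide the smallest unassigned node next
        best = None
        for group in candidates:
            if anchor in group and group <= remaining:
                rest_score, rest_groups = solve(remaining - group)
                total = score(group) + rest_score
                if best is None or total > best[0]:
                    best = (total, (group,) + rest_groups)
        return best  # never None: the singleton {anchor} is always a candidate

    return solve(frozenset(adjacency))


# Hypothetical example: components 0 and 1 are two fragments of one broken
# character; components 2 and 3 are intact characters.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
known_scores = {
    frozenset({0, 1}): 0.9,
    frozenset({2}): 0.8,
    frozenset({3}): 0.7,
}


def toy_score(group):
    # Stand-in for a classifier confidence; unknown shapes score poorly.
    return known_scores.get(group, 0.1)


total, groups = best_grouping(adjacency, toy_score)
```

In this toy setup the search merges components 0 and 1 and leaves 2 and 3 on their own. Summing scores favours splitting into more groups, so a real system would likely need a different objective, such as an average or a sum of log-probabilities.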
Michael Droettboom, "Correcting Broken Characters in the Recognition of Historical Printed Documents", Digital Libraries, Joint Conference on, p. 364, 2003, doi:10.1109/JCDL.2003.1204889