International Conference on Document Analysis and Recognition (2003)
Aug. 3, 2003 to Aug. 6, 2003
Yue Lu, National University of Singapore
Chew Lim Tan, National University of Singapore
In this paper, we present a compressed pattern matching method for searching for user-queried words in CCITT Group 4 compressed document images without decompressing them. The feature pixels, composed of black changing elements and white changing elements, are extracted directly from the CCITT Group 4 compressed document images. Connected components are labeled line by line, according to the relative positions of the changing elements of the current coding line and those of the reference line. Word boxes are then bounded by merging the connected components. A two-stage matching strategy is constructed to measure the dissimilarity between the template image of the user's query word and the words extracted from the document images. Experimental results confirmed the validity of the proposed approach.
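The line-by-line labeling step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the changing elements of each line have already been obtained as absolute column positions (the abstract's method reads them from the Group 4 codes, which is not shown here), and it uses a generic union-find with 8-connectivity. The names `black_runs`, `find`, and `label_components` are hypothetical.

```python
def black_runs(changes, width):
    """Pair changing elements into half-open black runs [start, end).

    Assumes each line starts white, so even-indexed changes open a black
    run and odd-indexed changes close it. A run with no closing change
    extends to the right edge of the line.
    """
    runs = []
    for i in range(0, len(changes), 2):
        start = changes[i]
        end = changes[i + 1] if i + 1 < len(changes) else width
        runs.append((start, end))
    return runs


def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def label_components(rows, width):
    """Return sorted bounding boxes (x0, y0, x1, y1), exclusive on x1 and y1.

    `rows` holds one list of changing-element columns per scan line. Each
    black run on the current line is compared only with the runs of the
    previous (reference) line.
    """
    parent = []   # union-find forest, one node per black run
    boxes = []    # bounding box of each individual run
    prev = []     # (start, end, label) for runs on the reference line
    for y, changes in enumerate(rows):
        cur = []
        for start, end in black_runs(changes, width):
            label = len(parent)
            parent.append(label)
            boxes.append((start, y, end, y + 1))
            for p_start, p_end, p_label in prev:
                # 8-connectivity: runs touch if they overlap or meet diagonally.
                if p_start <= end and start <= p_end:
                    a, b = find(parent, label), find(parent, p_label)
                    if a != b:
                        parent[b] = a
            cur.append((start, end, label))
        prev = cur

    # Merge the boxes of all runs that share a root.
    merged = {}
    for label, (x0, y0, x1, y1) in enumerate(boxes):
        root = find(parent, label)
        if root in merged:
            m = merged[root]
            merged[root] = (min(m[0], x0), min(m[1], y0),
                            max(m[2], x1), max(m[3], y1))
        else:
            merged[root] = (x0, y0, x1, y1)
    return sorted(merged.values())


if __name__ == "__main__":
    # Four scan lines of an 8-pixel-wide image, given as changing elements.
    rows = [[1, 3, 5, 7], [1, 2, 6, 7], [], [0]]
    for box in label_components(rows, width=8):
        print(box)
```

Word boxes could then be formed by merging component boxes whose horizontal gap falls below a chosen threshold; that threshold and the two-stage matching are left out because the abstract does not specify them.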
Yue Lu, Chew Lim Tan, "Word Searching in CCITT Group 4 Compressed Document Images", International Conference on Document Analysis and Recognition, vol. 1, p. 467, 2003, doi:10.1109/ICDAR.2003.1227709