A Parallel-Line Detection Algorithm Based on HMM Decoding
May 2005 (vol. 27 no. 5)
pp. 777-792
Detecting groups of parallel lines is important in applications such as form processing and the extraction of text (handwriting) from rule-lined paper. These tasks can be very challenging in degraded documents where the lines are severely broken. In this paper, we propose a novel model-based method that incorporates high-level context to detect these lines. After preprocessing (such as skew correction and text filtering), we use trained hidden Markov models (HMMs) to locate the optimal positions of all lines simultaneously on the horizontal or vertical projection profiles, using Viterbi decoding. The algorithm is trainable, so it can be easily adapted to different application scenarios. Experiments on known-form processing and rule-line detection show that our method is robust and achieves better results than other widely used line detection methods.
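
To make the idea of decoding line positions from a projection profile concrete, here is a minimal Python sketch. It is not the model from the paper: the abstract does not specify the HMM topology, features, or training procedure, so this toy uses an assumed two-state HMM (gap and line), hand-set rather than trained parameters, and a binomial-style emission score. All function names and parameter values (`projection_profile`, `viterbi_line_rows`, `line_positions`, `p_stay`, `fill_line`, `fill_gap`) are hypothetical and chosen only for illustration.

```python
import math


def projection_profile(image):
    """Horizontal projection: number of black pixels (1s) in each row."""
    return [sum(row) for row in image]


def viterbi_line_rows(profile, width, p_stay=0.9, fill_line=0.7, fill_gap=0.05):
    """Label every row as gap (0) or line (1) with a two-state HMM.

    Assumed, hand-set parameters (not trained, not from the paper):
      p_stay    - probability of staying in the same state on the next row
      fill_line - probability that a pixel is black in a line row
      fill_gap  - probability that a pixel is black in a gap row
    `profile` must be non-empty.
    """
    states = (0, 1)
    fill = (fill_gap, fill_line)
    log_stay = math.log(p_stay)
    log_switch = math.log(1.0 - p_stay)

    def emit(state, count):
        # Binomial log-likelihood of `count` black pixels out of `width`.
        # The binomial coefficient is omitted: it is identical for both states.
        p = fill[state]
        return count * math.log(p) + (width - count) * math.log(1.0 - p)

    # Initialisation with a uniform prior over the two states.
    score = [math.log(0.5) + emit(s, profile[0]) for s in states]
    backpointers = []

    # Recursion: keep the best-scoring predecessor for every state.
    for count in profile[1:]:
        new_score, pointers = [], []
        for s in states:
            candidates = [
                score[r] + (log_stay if r == s else log_switch) for r in states
            ]
            best = max(states, key=lambda r: candidates[r])
            new_score.append(candidates[best] + emit(s, count))
            pointers.append(best)
        score = new_score
        backpointers.append(pointers)

    # Backtracking from the best final state.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def line_positions(path):
    """Row indices that the decoder labelled as line rows."""
    return [i for i, state in enumerate(path) if state == 1]


# Toy 8 x 10 binary image: two broken horizontal lines and one noise pixel.
blank = [0] * 10
image = [
    blank,
    blank,
    [1, 1, 1, 0, 0, 1, 1, 1, 0, 1],  # broken line, 7 of 10 pixels set
    blank,
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],  # isolated noise pixel
    blank,
    [1, 1, 0, 1, 1, 0, 0, 1, 1, 0],  # broken line, 6 of 10 pixels set
    blank,
]
profile = projection_profile(image)
path = viterbi_line_rows(profile, width=10)
print(line_positions(path))
```

On this toy input, the decoder labels rows 2 and 6 as lines despite their gaps, and leaves the row with the single noise pixel as a gap. Because the whole profile is decoded as one sequence, every row label is chosen jointly rather than by thresholding each row in isolation, which is the sense in which all lines are located simultaneously.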

Index Terms:
Line detection, form processing, form registration, form identification, hidden Markov model, document image analysis.
Citation:
Yefeng Zheng, Huiping Li, David Doermann, "A Parallel-Line Detection Algorithm Based on HMM Decoding," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 5, pp. 777-792, May 2005, doi:10.1109/TPAMI.2005.89