A Laplacian Approach to Multi-Oriented Text Detection in Video
February 2011 (vol. 33 no. 2)
pp. 412-419
Palaiahnakote Shivakumara, National University of Singapore, Singapore
Trung Quy Phan, National University of Singapore, Singapore
Chew Lim Tan, National University of Singapore, Singapore
Abstract: In this paper, we propose a video text detection method based on the Laplacian in the frequency domain. Unlike many approaches that assume horizontally oriented text, our method handles text of arbitrary orientation. The input image is first filtered with a Fourier-Laplacian filter. K-means clustering is then used to identify candidate text regions based on the maximum difference. The skeleton of each connected component helps to separate the different text strings from each other. Finally, text string straightness and edge density are used to eliminate false positives. Experimental results show that the proposed method is able to handle graphics text and scene text of both horizontal and nonhorizontal orientation.
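
The first two stages named in the abstract (frequency-domain Laplacian filtering, then K-means on the maximum difference) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that "maximum difference" means the maximum minus the minimum of the filtered values inside a small horizontal sliding window, that K = 2 (text versus non-text), and that a window width of 5 is reasonable; none of these details are stated in the abstract. The function names are hypothetical, and the skeleton-based string separation and false positive elimination stages are omitted.

```python
import numpy as np


def fourier_laplacian(gray):
    """Laplacian applied in the frequency domain.

    The spectrum is multiplied by -4*pi^2*(u^2 + v^2), the transfer
    function of the continuous Laplacian, and transformed back.
    """
    rows, cols = gray.shape
    u = np.fft.fftfreq(rows)[:, None]   # vertical frequencies (cycles/pixel)
    v = np.fft.fftfreq(cols)[None, :]   # horizontal frequencies
    transfer = -4.0 * np.pi ** 2 * (u ** 2 + v ** 2)
    return np.real(np.fft.ifft2(np.fft.fft2(gray) * transfer))


def maximum_difference(filtered, window=5):
    """Max minus min inside a 1 x window horizontal sliding window.

    Assumption: this window shape and an odd width are illustrative
    choices, not values taken from the abstract.
    """
    half = window // 2
    padded = np.pad(filtered, ((0, 0), (half, half)), mode="edge")
    width = filtered.shape[1]
    shifted = np.stack([padded[:, k:k + width] for k in range(window)])
    return shifted.max(axis=0) - shifted.min(axis=0)


def two_means_mask(values, iterations=20):
    """1-D K-means with K = 2; True marks the cluster with the higher mean."""
    flat = values.ravel()
    low, high = flat.min(), flat.max()
    for _ in range(iterations):
        is_high = np.abs(flat - high) < np.abs(flat - low)
        if is_high.all() or not is_high.any():
            break  # degenerate split, keep the current centres
        low, high = flat[~is_high].mean(), flat[is_high].mean()
    return np.abs(values - high) < np.abs(values - low)


# Synthetic frame: a block of thin vertical strokes on a flat background.
image = np.zeros((32, 32))
image[12:20, 8:24:2] = 255.0

candidate_mask = two_means_mask(maximum_difference(fourier_laplacian(image)))
print(candidate_mask.sum(), "candidate text pixels")
```

On the synthetic frame, the stroke block produces large positive and negative Laplacian responses side by side, so its maximum difference is high and it falls into the high-mean cluster, while the flat background does not. A real frame would still need the connected component, skeleton, straightness, and edge density steps described in the abstract.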

Index Terms:
Connected component analysis, frequency domain processing, text detection, text orientation.
Citation:
Palaiahnakote Shivakumara, Trung Quy Phan, Chew Lim Tan, "A Laplacian Approach to Multi-Oriented Text Detection in Video," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 2, pp. 412-419, Feb. 2011, doi:10.1109/TPAMI.2010.166