Texture-Based Approach for Text Detection in Images Using Support Vector Machines and Continuously Adaptive Mean Shift Algorithm
December 2003 (vol. 25 no. 12)
pp. 1631-1639
Abstract: This paper presents a novel texture-based method for detecting text in images. A support vector machine (SVM) is used to analyze the textural properties of text. No external texture feature extraction module is used; instead, the intensities of the raw pixels that make up the textural pattern are fed directly to the SVM, which works well even in high-dimensional spaces. Text regions are then identified by applying a continuously adaptive mean shift algorithm (CAMSHIFT) to the results of the texture analysis. Combining CAMSHIFT with SVMs yields text detection that is both robust and efficient: time-consuming texture analysis of less relevant pixels is avoided, so only a small part of the input image needs to be texture-analyzed.
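The efficiency argument in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function `camshift_text_search`, the stand-in classifier `toy_score`, and the window-size rule (half-width set to the square root of the zeroth moment, a convention borrowed from common CAMSHIFT formulations) are all assumptions made for illustration. The point is only that the classifier is queried lazily, for pixels inside the current search window, rather than for the whole image.

```python
import math


def camshift_text_search(score_fn, width, height, cx, cy, half, max_iter=20):
    """CAMSHIFT-style search over per-pixel text scores in [0, 1].

    score_fn(x, y) is a hypothetical stand-in for the SVM texture classifier.
    It is called only for pixels inside the current window, and results are
    cached so that no pixel is classified twice.
    Returns ((cx, cy, half), number_of_pixels_classified).
    """
    cache = {}

    def score(x, y):
        if (x, y) not in cache:
            cache[(x, y)] = score_fn(x, y)
        return cache[(x, y)]

    for _ in range(max_iter):
        # Clip the square search window to the image bounds.
        x0, x1 = max(0, cx - half), min(width - 1, cx + half)
        y0, y1 = max(0, cy - half), min(height - 1, cy + half)

        # Zeroth and first moments of the scores inside the window.
        m00 = m10 = m01 = 0.0
        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                p = score(x, y)
                m00 += p
                m10 += x * p
                m01 += y * p
        if m00 == 0.0:
            break  # no text evidence in the window; give up

        # Mean shift step: move the window centre to the score centroid.
        new_cx = int(round(m10 / m00))
        new_cy = int(round(m01 / m00))
        # Adaptive step (assumed rule): resize the window from the mass.
        new_half = max(1, int(round(math.sqrt(m00))))

        converged = (new_cx, new_cy, new_half) == (cx, cy, half)
        cx, cy, half = new_cx, new_cy, new_half
        if converged:
            break

    return (cx, cy, half), len(cache)


def toy_score(x, y):
    """Toy classifier: a 10x5 'text' block at x in 20..29, y in 10..14."""
    return 1.0 if 20 <= x <= 29 and 10 <= y <= 14 else 0.0


if __name__ == "__main__":
    window, classified = camshift_text_search(toy_score, 40, 40, 18, 12, 4)
    print("window:", window, "pixels classified:", classified, "of", 40 * 40)
```

On this toy input the window drifts from its starting position toward the centre of the text block while classifying only a fraction of the 1600 pixels. In a real system `score_fn` would be the expensive step, which is why restricting it to the window matters.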
Index Terms:
Text detection, image indexing, texture analysis, support vector machine, CAMSHIFT.
Citation:
Kwang In Kim, Keechul Jung, Jin Hyung Kim, "Texture-Based Approach for Text Detection in Images Using Support Vector Machines and Continuously Adaptive Mean Shift Algorithm," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 12, pp. 1631-1639, Dec. 2003, doi:10.1109/TPAMI.2003.1251157