2011 International Conference on Document Analysis and Recognition (ICDAR 2011)
Sept. 18, 2011 to Sept. 21, 2011
In this paper, we present a new method for video script identification, a step that is needed to choose an appropriate OCR engine for recognizing text lines when a video frame contains more than one language. The input to script identification is the set of text lines obtained by our text detection method. For each connected component of the Canny edges of a text line, we extract the upper and lower extreme points. The extracted points are connected to form an upper line and a lower line, whose behavior we then study. The direction of each 10-pixel segment of these lines is determined using PCA, and the average angle of the segments of the upper and lower lines is computed as a measure of the smoothness and cursiveness of the lines. In addition, to discriminate the scripts more accurately, the method divides a text line horizontally into five equal zones and studies the smoothness and cursiveness of the upper and lower lines within each zone. We evaluate the method experimentally on different combinations of languages: English and Chinese, English and Tamil, Chinese and Tamil, and English, Chinese and Tamil.
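The segment-angle and zone features described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`connect_points`, `segment_angles`, `line_features`), the linear interpolation used to connect the extreme points, and the use of the mean absolute angle as the summary value are all assumptions made for illustration. The input is assumed to be the x and y coordinates of the upper (or lower) extreme points of one text line.

```python
import numpy as np

def connect_points(xs, ys):
    """Connect extreme points into a line with one y value per pixel column.
    Linear interpolation is an assumption; the abstract only says 'connected'."""
    order = np.argsort(xs)
    xs = np.asarray(xs, dtype=float)[order]
    ys = np.asarray(ys, dtype=float)[order]
    grid = np.arange(xs[0], xs[-1] + 1)
    return grid, np.interp(grid, xs, ys)

def segment_angles(grid, line, seg_len=10):
    """Angle in degrees of the PCA principal axis of each seg_len-pixel segment."""
    angles = []
    for start in range(0, len(grid) - seg_len + 1, seg_len):
        pts = np.column_stack([grid[start:start + seg_len],
                               line[start:start + seg_len]])
        cov = np.cov(pts, rowvar=False)
        _, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
        vx, vy = eigvecs[:, -1]            # axis of largest variance
        if vx < 0:                         # fix the sign so angles lie in [-90, 90]
            vx, vy = -vx, -vy
        angles.append(np.degrees(np.arctan2(vy, vx)))
    return np.array(angles)

def mean_abs_angle(grid, line, seg_len=10):
    """Mean absolute segment angle; NaN if the span is shorter than one segment."""
    angles = segment_angles(grid, line, seg_len)
    return float(np.mean(np.abs(angles))) if angles.size else float("nan")

def line_features(xs, ys, n_zones=5, seg_len=10):
    """Overall mean angle plus one mean angle per horizontal zone."""
    grid, line = connect_points(xs, ys)
    overall = mean_abs_angle(grid, line, seg_len)
    zones = [mean_abs_angle(g, l, seg_len)
             for g, l in zip(np.array_split(grid, n_zones),
                             np.array_split(line, n_zones))]
    return overall, zones

# Hypothetical upper points of one text line: a flat line and a zigzag line.
flat = line_features([0, 50, 99], [5, 5, 5])
zigzag = line_features(list(range(0, 101, 10)), [0, 10] * 5 + [0])
```

Under these assumptions, a flat upper or lower line gives angles near zero, while a jagged one gives larger angles. Running the same functions on both the upper and the lower points of a text line gives two overall values and ten zone values that a classifier could use.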
Video text line, Upper and lower points, Smoothness, Cursiveness, Video script line identification
Zhang Ding, Shijian Lu, Palaiahnakote Shivakumara, Trung Quy Phan, Chew Lim Tan, "Video Script Identification Based on Text Lines", 2011 International Conference on Document Analysis and Recognition, pp. 1240-1244, 2011, doi:10.1109/ICDAR.2011.250