Los Angeles, CA
March 31, 2009 to April 2, 2009
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/CSIE.2009.107
Caption text provides valuable information about the content of video sequences. This paper proposes an efficient method for locating candidate caption text regions directly in the DCT compressed domain. Candidate text blocks are detected by their DCT texture energy. A 3×3 median filter is applied as a spatial constraint to refine the text regions. An adaptive temporal constraint is designed on the observation that the same caption text lasts for at least two seconds. Finally, the extracted text regions are converted into HSV color space to generate the binary text images required by commercial OCR engines. Experimental results on several video sequences show that the proposed algorithm efficiently detects and extracts caption text in MPEG video sequences with various scene complexities.
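The first three stages named in the abstract (texture-energy detection, 3×3 median filtering, temporal persistence) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which DCT coefficients enter the energy, how the threshold is chosen, or how the temporal constraint adapts. The sketch assumes the energy is the sum of absolute AC coefficients of each 8×8 block, uses a fixed hypothetical `threshold`, pads the mask border with zeros, and replaces the adaptive temporal rule with a fixed minimum run length `min_frames` (for example, two seconds' worth of decoded frames). All function names are illustrative.

```python
from statistics import median


def texture_energy(block):
    """Sum of absolute AC coefficients of one 8x8 DCT block.

    Assumption: the DC term at block[0][0] is excluded; the paper may
    use a different subset of coefficients.
    """
    return sum(abs(block[u][v])
               for u in range(8) for v in range(8)
               if (u, v) != (0, 0))


def candidate_mask(blocks, threshold):
    """Mark each block of a frame as candidate text (1) or background (0).

    `blocks` is a 2-D grid (list of rows) of 8x8 coefficient blocks.
    `threshold` is a hypothetical fixed value, not taken from the paper.
    """
    return [[1 if texture_energy(b) >= threshold else 0 for b in row]
            for row in blocks]


def median_filter_3x3(mask):
    """Spatial constraint: 3x3 median over the binary block mask.

    Positions outside the mask are treated as 0 (an assumption).
    """
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [mask[rr][cc]
                      if 0 <= rr < rows and 0 <= cc < cols else 0
                      for rr in range(r - 1, r + 2)
                      for cc in range(c - 1, c + 2)]
            out[r][c] = int(median(window))
    return out


def temporal_filter(masks, min_frames):
    """Temporal constraint: keep a block only while it stays flagged for
    at least `min_frames` consecutive frames (fixed-length stand-in for
    the paper's adaptive rule)."""
    n = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    out = [[[0] * cols for _ in range(rows)] for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            start = None  # first frame index of the current run of 1s
            for t in range(n + 1):
                on = t < n and masks[t][r][c] == 1
                if on and start is None:
                    start = t
                elif not on and start is not None:
                    if t - start >= min_frames:
                        for k in range(start, t):
                            out[k][r][c] = 1
                    start = None
    return out
```

A typical use would call `candidate_mask` and `median_filter_3x3` on each decoded frame's block grid, then pass the resulting list of masks to `temporal_filter`. The HSV binarization step is omitted because the abstract gives no detail about it.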
Caption Text, DCT, Compressed Domain, Texture Energy, Text Extraction
Xiuhua Jiang, Jiangbo Xu, "Caption Text Extraction Using DCT Feature in MPEG Compressed Video", 2009 WRI World Congress on Computer Science and Information Engineering (CSIE 2009), pp. 431-434, doi:10.1109/CSIE.2009.107