Computer Science and Information Engineering, World Congress on (2009)
Los Angeles, California USA
Mar. 31, 2009 to Apr. 2, 2009
ISBN: 978-0-7695-3507-4
pp: 431-434
Caption text provides valuable information about the contents of video sequences. In this paper, an efficient method is proposed for locating candidate caption text regions directly in the DCT compressed domain. Candidate text blocks are detected by their DCT texture energy. A 3×3 median filter is applied as a spatial constraint to refine the text regions. An adaptive temporal constraint is designed on the observation that the same caption text persists for at least two seconds. Finally, the extracted text regions are converted into HSV color space to generate the binary text images required by commercial OCR software. Experimental results on several video sequences show that the proposed algorithm detects and extracts caption text efficiently in MPEG video sequences of varying scene complexity.
Caption Text, DCT, Compressed Domain, Texture Energy, Text extraction
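
The detection steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which DCT coefficients enter the texture energy, how the threshold is chosen, or how the temporal constraint adapts, so the choices below (sum of absolute AC coefficients, a caller-supplied threshold, zero padding at the frame border, and a fixed `min_frames` run length) are assumptions made only for illustration. The function names are hypothetical, and the input is assumed to be an array of 8×8 DCT blocks already parsed from the compressed stream.

```python
import numpy as np


def texture_energy(coeffs):
    """Per-block texture energy from DCT coefficients.

    coeffs has shape (rows, cols, 8, 8): one 8x8 coefficient block per
    image block. Assumption: energy is the sum of absolute AC coefficients.
    """
    ac = np.abs(np.asarray(coeffs, dtype=float))
    ac[..., 0, 0] = 0.0  # drop the DC term, which carries brightness, not texture
    return ac.sum(axis=(-2, -1))


def median3x3(mask):
    """3x3 median filter on a boolean block mask (zero padding at the border)."""
    mask = np.asarray(mask, dtype=bool)
    rows, cols = mask.shape
    padded = np.pad(mask.astype(np.uint8), 1, mode="constant")
    # Stack the nine shifted views that make up each 3x3 neighbourhood.
    windows = np.stack(
        [padded[r:r + rows, c:c + cols] for r in range(3) for c in range(3)]
    )
    return np.median(windows, axis=0) > 0


def temporal_filter(masks, min_frames):
    """Keep a block only inside runs of at least min_frames consecutive hits.

    Simplified, non-adaptive stand-in for the temporal constraint.
    """
    masks = np.asarray(masks, dtype=bool)
    run = np.zeros(masks.shape[1:], dtype=int)
    runs = np.zeros(masks.shape, dtype=int)
    for t in range(len(masks)):  # forward pass: running length of each hit streak
        run = np.where(masks[t], run + 1, 0)
        runs[t] = run
    keep = np.zeros_like(masks)
    total = np.zeros(masks.shape[1:], dtype=int)
    for t in range(len(masks) - 1, -1, -1):  # backward pass: spread full run length
        total = np.where(masks[t], np.maximum(total, runs[t]), 0)
        keep[t] = total >= min_frames
    return keep


def detect_text_blocks(frames_coeffs, threshold, min_frames):
    """Chain the three steps over a sequence of frames of DCT blocks."""
    spatial = [median3x3(texture_energy(c) > threshold) for c in frames_coeffs]
    return temporal_filter(spatial, min_frames)
```

With captions lasting at least two seconds, `min_frames` would be derived from how many of the analyzed frames fall within that interval, which depends on the frame rate and on which frame types are examined; neither is specified here.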

Y. Wang, X. Jiang and J. Xu, "Caption Text Extraction Using DCT Feature in MPEG Compressed Video," 2009 WRI World Congress on Computer Science and Information Engineering (CSIE), Los Angeles, CA, 2009, pp. 431-434.