Issue No.04 - April (2000 vol.22)
pp: 385-392
ABSTRACT
We present a method to automatically localize captions in JPEG compressed images and the I-frames of MPEG compressed videos. Caption text regions are segmented from background images using their distinguishing texture characteristics. Unlike previously published methods, which fully decompress the video sequence before extracting the text regions, this method locates candidate caption text regions directly in the DCT compressed domain using the intensity variation information encoded in the DCT domain. Therefore, only a very small amount of decoding is required. The proposed algorithm takes about 0.006 second to process a 240 × 350 image and achieves a recall rate of 99.17 percent while falsely accepting about 1.87 percent of nontext DCT blocks on a variety of MPEG compressed videos containing more than 2,300 I-frames.
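The idea of using DCT coefficients as a texture cue can be illustrated with a minimal sketch. It is not the authors' algorithm: it substitutes a plain threshold on per-block AC energy, and it recomputes the 8 × 8 DCT from pixels only so that the example is self-contained, whereas the method described above reads the coefficients from the compressed stream. The function names and the threshold value are hypothetical.

```python
import numpy as np

BLOCK = 8  # JPEG and MPEG I-frames use 8x8 DCT blocks


def dct_matrix(n: int = BLOCK) -> np.ndarray:
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    x = np.arange(n).reshape(1, -1)  # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m


def block_dct(image: np.ndarray) -> np.ndarray:
    """Return DCT coefficients with shape (rows, cols, 8, 8).

    Assumption: this stands in for coefficients a real decoder would
    take from the bitstream. The image is cropped to multiples of 8.
    """
    h = (image.shape[0] // BLOCK) * BLOCK
    w = (image.shape[1] // BLOCK) * BLOCK
    blocks = image[:h, :w].astype(float)
    blocks = blocks.reshape(h // BLOCK, BLOCK, w // BLOCK, BLOCK)
    blocks = blocks.transpose(0, 2, 1, 3)  # (rows, cols, 8, 8)
    d = dct_matrix()
    return d @ blocks @ d.T  # 2-D DCT applied to every block


def texture_energy(coeffs: np.ndarray) -> np.ndarray:
    """Sum of absolute AC coefficients per block (DC term excluded)."""
    total = np.abs(coeffs).sum(axis=(2, 3))
    return total - np.abs(coeffs[:, :, 0, 0])


def candidate_text_blocks(coeffs: np.ndarray, threshold: float) -> np.ndarray:
    """Boolean mask of blocks whose AC energy exceeds the threshold.

    Assumption: a single global threshold, chosen by hand.
    """
    return texture_energy(coeffs) > threshold


if __name__ == "__main__":
    # Flat gray frame with one high-contrast striped block at (0, 1).
    frame = np.full((16, 32), 128.0)
    frame[0:8, 8:16] = np.tile([0.0, 255.0], (8, 4))
    print(candidate_text_blocks(block_dct(frame), threshold=100.0))
```

Flat background blocks carry almost no AC energy, while blocks with sharp intensity changes, as caption strokes produce, carry a lot. A practical detector would still need further steps to reject textured non-text regions.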
INDEX TERMS
Caption extraction, text location, texture, compressed video, segmentation, multimedia.
CITATION
Yu Zhong, Hongjiang Zhang, Anil K. Jain, "Automatic Caption Localization in Compressed Video", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.22, no. 4, pp. 385-392, April 2000, doi:10.1109/34.845381