17th International Conference on Pattern Recognition (ICPR'04) - Volume 1
A Novel Video Caption Detection Approach Using Multi-Frame Integration
Cambridge, UK
August 23-August 26
ISBN: 0-7695-2128-2
Lide Wu, Fudan University, Shanghai, China
Captions in videos often play an important role in video information indexing and retrieval. In this paper, we present a novel video caption detection approach. We first apply a new Multiple Frames Integration (MFI) method to minimize the variation of the image background. A time-based minimum (or maximum) pixel value search is employed, and a Sobel edge map is used to determine the mode of search. Then block-based text detection is performed: a small window scans the image, and each window position is classified as text or non-text using Sobel edges as features. We use a two-level pyramid to detect text of various sizes. Finally, we present a new iterative text line decomposition method, with which accurate text bounding boxes are extracted from candidate text areas. Experimental results show that the proposed approach achieves high precision and recall.
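The time-based minimum (or maximum) pixel value search mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `integrate_frames`, the `mode` parameter, and the toy clip are assumptions introduced here for illustration. The abstract says a Sobel edge map determines the search mode but does not give the rule, so the mode is simply passed in by the caller. The underlying idea is that a stationary caption keeps its pixel values across frames while the background changes, so a per-pixel minimum over time tends to darken the background behind bright text, and a per-pixel maximum tends to brighten it behind dark text.

```python
import numpy as np

def integrate_frames(frames, mode="min"):
    """Collapse a stack of grayscale frames into one image by taking the
    per-pixel minimum or maximum along the time axis.

    `mode` is a hypothetical parameter; the paper selects the search mode
    from a Sobel edge map, a rule not reproduced here.
    """
    stack = np.asarray(frames)
    if stack.ndim != 3:
        raise ValueError("expected shape (num_frames, height, width)")
    if mode == "min":
        return stack.min(axis=0)
    if mode == "max":
        return stack.max(axis=0)
    raise ValueError("mode must be 'min' or 'max'")

# Toy clip (made-up data): background values change from frame to frame,
# while two bright "caption" pixels stay fixed at 255.
rng = np.random.default_rng(0)
clip = rng.integers(0, 200, size=(5, 4, 4), dtype=np.uint8)
clip[:, 1, 1:3] = 255

integrated = integrate_frames(clip, mode="min")
```

In this toy example the caption pixels survive the minimum search unchanged, while every background pixel drops to the darkest value it took across the five frames, which increases the contrast between caption and background before edge-based detection.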
Citation:
Rongrong Wang, Wanjun Jin, Lide Wu, "A Novel Video Caption Detection Approach Using Multi-Frame Integration," icpr, vol. 1, pp.449-452, 17th International Conference on Pattern Recognition (ICPR'04) - Volume 1, 2004