2011 IEEE Applied Imagery Pattern Recognition Workshop (AIPR) (2011)
Washington, DC, USA
Oct. 11, 2011 to Oct. 13, 2011
ISBN: 978-1-4673-0215-9
pp: 1-8
Tayfun Tuna, University of Houston, Computer Science Department, TX 77204-3010, USA
Jaspal Subhlok, University of Houston, Computer Science Department, TX 77204-3010, USA
Shishir Shah, University of Houston, Computer Science Department, TX 77204-3010, USA
ABSTRACT
Lecture videos have been commonly used to supplement in-class teaching and for distance learning. Videos recorded during in-class teaching and made accessible online are a versatile resource on par with a textbook and the classroom itself. Nonetheless, the adoption of lecture videos has been limited, in large part due to the difficulty of quickly accessing the content of interest in a long video lecture. In this work, we present "video indexing" and "keyword search", which facilitate access to video content and enhance the user experience. Video indexing divides a video lecture into segments indicating different topics by identifying scene changes based on the analysis of the difference image from a pair of video frames. We propose an efficient indexing algorithm that leverages the unique features of lecture videos. Binary search with frame sampling is employed to efficiently analyze long videos. Keyword search identifies video segments that match a particular keyword. Since text in a video frame often contains a diversity of colors, font sizes and backgrounds, our text detection approach requires specialized preprocessing followed by the use of off-the-shelf OCR engines, which are designed primarily for scanned documents. We present image enhancements, namely text segmentation and inversion, to increase the detection accuracy of OCR tools. Experimental results on a suite of diverse video lectures were used to validate the methods developed in this work. Average processing time for a one-hour lecture is around 14 minutes on a typical desktop. Search accuracy of three distinct OCR engines (Tesseract, GOCR and MODI) increased significantly with our preprocessing transformations, yielding an overall combined accuracy of 97%. The work presented here is part of a video streaming framework deployed at multiple campuses serving hundreds of lecture videos.
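The indexing idea described in the abstract (difference images between frame pairs, combined with frame sampling and binary search) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the two thresholds, the sampling step, and the assumption that each sampled interval contains at most one scene change are all hypothetical choices made for illustration. Frames are modeled as plain nested lists of grayscale values so the example needs no external libraries.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # rows of grayscale pixel values, 0-255


def diff_fraction(a: Frame, b: Frame, pixel_threshold: int = 30) -> float:
    """Fraction of pixels whose absolute difference exceeds pixel_threshold.

    pixel_threshold is an assumed value, not one taken from the paper.
    """
    total = 0
    changed = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += 1
            if abs(pa - pb) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def is_scene_change(a: Frame, b: Frame, frame_threshold: float = 0.2) -> bool:
    """Treat two frames as different scenes if enough pixels changed.

    frame_threshold is an assumed value, not one taken from the paper.
    """
    return diff_fraction(a, b) > frame_threshold


def locate_change(frames: Sequence[Frame], lo: int, hi: int) -> int:
    """Binary search for the first frame in (lo, hi] that differs from frames[lo].

    Assumes frames[lo] and frames[hi] differ and that the interval
    contains a single scene change.
    """
    base = frames[lo]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_scene_change(base, frames[mid]):
            hi = mid  # change happened at or before mid
        else:
            lo = mid  # change happened after mid
    return hi


def find_index_points(frames: Sequence[Frame], step: int = 8) -> List[int]:
    """Compare frames `step` apart; refine each detected change by binary search."""
    points: List[int] = []
    lo = 0
    last = len(frames) - 1
    while lo < last:
        hi = min(lo + step, last)
        if is_scene_change(frames[lo], frames[hi]):
            points.append(locate_change(frames, lo, hi))
        lo = hi
    return points
```

With this approach only a sampled subset of frame pairs is compared, and the full per-frame search is limited to intervals where a change was detected, which is one plausible reading of why sampling plus binary search reduces the cost of analyzing long videos.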
CITATION

S. Shah, T. Tuna and J. Subhlok, "Indexing and keyword search to ease navigation in lecture videos," 2011 IEEE Applied Imagery Pattern Recognition Workshop (AIPR), Washington, DC, USA, 2011, pp. 1-8.
doi:10.1109/AIPR.2011.6176364