2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
Anchorage, AK, USA
June 23, 2008 to June 28, 2008
ISBN: 978-1-4244-2339-2
pp: 1-8
Niels Haering , ObjectVideo, USA
Asaad Hakeem , ObjectVideo, USA
Song-Chun Zhu , Dept. of Statistics and Computer Science, University of California, Los Angeles, USA
Mun Wai Lee , ObjectVideo, USA
In this paper we propose a framework that performs automatic semantic annotation of visual events (SAVE). This is an enabling technology for content-based video annotation, query, and retrieval, with applications in Internet video search and video data mining. The method involves identifying objects in the scene, describing their inter-relations, detecting events of interest, and representing them semantically in a human-readable and queryable format. The SAVE framework is composed of three main components. The first is an image parsing engine that performs scene content extraction using bottom-up image analysis and a stochastic attribute image grammar; we define a visual vocabulary spanning pixels, primitives, parts, objects, and scenes, specify their spatio-temporal and compositional relations, and use a combined bottom-up/top-down strategy for inference. The second is an event inference engine, in which the Video Event Markup Language (VEML) is adopted for semantic representation and a grammar-based approach is used for event analysis and detection. The third is a text generation engine that produces text reports using head-driven phrase structure grammar (HPSG). The main contribution of this paper is an end-to-end framework that infers visual events and annotates a large collection of videos. Experiments with maritime and urban scenes indicate the feasibility of the proposed approach.
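As a concrete illustration of the kind of machine-readable output the event inference engine produces, the sketch below builds a minimal VEML-style XML annotation for a single detected event. The element and attribute names used here (`event`, `temporal-interval`, `agent`) are illustrative assumptions for this sketch, not the actual VEML schema described in the paper.

```python
import xml.etree.ElementTree as ET

def annotate_event(event_type, start_frame, end_frame, agents):
    """Build a minimal VEML-like XML annotation for one detected event.

    NOTE: element/attribute names are hypothetical placeholders, not the
    real VEML schema; this only illustrates the general representation.
    """
    root = ET.Element("event", {"type": event_type})
    interval = ET.SubElement(root, "temporal-interval")
    ET.SubElement(interval, "start").text = str(start_frame)
    ET.SubElement(interval, "end").text = str(end_frame)
    # Each participating object becomes an <agent> element.
    for agent_id, label in agents:
        ET.SubElement(root, "agent", {"id": agent_id, "class": label})
    return ET.tostring(root, encoding="unicode")

xml_str = annotate_event("approach", 120, 240,
                         [("obj-1", "boat"), ("obj-2", "pier")])
print(xml_str)
```

An annotation in this spirit is both human-readable and queryable, e.g. by filtering events on the `type` attribute or the agents' object classes.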
Niels Haering, Asaad Hakeem, Song-Chun Zhu, Mun Wai Lee, "SAVE: A framework for semantic annotation of visual events", 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pp. 1-8, 2008, doi:10.1109/CVPRW.2008.4562954