CVPRW 2008: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
Anchorage, AK, USA
June 23, 2008 to June 28, 2008
Asaad Hakeem, ObjectVideo, USA
Niels Haering, ObjectVideo, USA
Mun Wai Lee, ObjectVideo, USA
In this paper we propose a framework that performs automatic semantic annotation of visual events (SAVE). This is an enabling technology for content-based video annotation, query and retrieval, with applications in Internet video search and video data mining. The method involves identifying objects in the scene, describing their inter-relations, detecting events of interest, and representing them semantically in a human-readable and queryable format. The SAVE framework is composed of three main components. The first component is an image parsing engine that performs scene content extraction using bottom-up image analysis and a stochastic attribute image grammar, in which we define a visual vocabulary from pixels, primitives, parts, objects and scenes, and specify their spatio-temporal or compositional relations; inference uses a combined bottom-up and top-down strategy. The second component is an event inference engine, where the Video Event Markup Language (VEML) is adopted for semantic representation, and a grammar-based approach is used for event analysis and detection. The third component is a text generation engine that generates a text report using head-driven phrase structure grammar (HPSG). The main contribution of this paper is a framework for an end-to-end system that infers visual events and annotates a large collection of videos. Experiments with maritime and urban scenes indicate the feasibility of the proposed approach.
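The data flow between the three components can be illustrated with a minimal sketch. It is not the authors' implementation, and it uses toy stand-ins throughout: the image parsing stage is replaced by hand-written detections, the grammar-based event inference by a single distance rule, VEML by a plain record, and HPSG by a sentence template. All names (`DetectedObject`, `Event`, `infer_approach_events`, `generate_text`), the `near` threshold, and the boat and dock data are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class DetectedObject:
    """One object in one frame (hypothetical output of the image parsing stage)."""
    name: str
    frame: int
    x: float
    y: float


@dataclass(frozen=True)
class Event:
    """A detected event (a plain record standing in for a VEML entry)."""
    kind: str
    agent: str
    target: str
    start_frame: int
    end_frame: int


def distance(a, b):
    """Euclidean distance between two detections."""
    return hypot(a.x - b.x, a.y - b.y)


def infer_approach_events(detections, agent, target, near=5.0):
    """Toy rule (an assumption, not the paper's grammar): report 'approach' when
    the agent starts farther than `near` from the target and ends within `near`."""
    by_frame = {}
    for d in detections:
        by_frame.setdefault(d.frame, {})[d.name] = d
    # Only frames in which both objects were detected are usable.
    frames = sorted(f for f, objs in by_frame.items()
                    if agent in objs and target in objs)
    if len(frames) < 2:
        return []
    first, last = frames[0], frames[-1]
    d_start = distance(by_frame[first][agent], by_frame[first][target])
    d_end = distance(by_frame[last][agent], by_frame[last][target])
    if d_start > near >= d_end:
        return [Event("approach", agent, target, first, last)]
    return []


def generate_text(event):
    """Template-based sentence (a stand-in for HPSG-based generation)."""
    verbs = {"approach": "approaches"}
    verb = verbs.get(event.kind, event.kind)
    return (f"The {event.agent} {verb} the {event.target} "
            f"between frames {event.start_frame} and {event.end_frame}.")


# Hypothetical maritime detections: a boat moving toward a fixed dock.
detections = [
    DetectedObject("boat", 0, 0.0, 0.0), DetectedObject("dock", 0, 20.0, 0.0),
    DetectedObject("boat", 1, 10.0, 0.0), DetectedObject("dock", 1, 20.0, 0.0),
    DetectedObject("boat", 2, 17.0, 0.0), DetectedObject("dock", 2, 20.0, 0.0),
]
events = infer_approach_events(detections, "boat", "dock")
for event in events:
    print(generate_text(event))
```

Keeping the event record separate from the sentence template mirrors the separation the abstract describes between semantic representation and text generation: the same record could be queried directly or rendered as text.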
Asaad Hakeem, Niels Haering, Mun Wai Lee, "SAVE: A framework for semantic annotation of visual events", IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2008, pp. 1-8, doi:10.1109/CVPRW.2008.4562954