15th International Conference on Pattern Recognition (ICPR'00) - Volume 4
Generating Natural Language Description of Human Behavior from Video Images
Barcelona, Spain
September 3-8, 2000
ISBN: 0-7695-0750-6
BibTeX:
@article{10.1109/ICPR.2000.903020,
  author = {Atsuhiro Kojima and Masao Izumi and Takeshi Tamura and Kunio Fukunaga},
  title = {Generating Natural Language Description of Human Behavior from Video Images},
  journal = {Pattern Recognition, International Conference on},
  volume = {4},
  year = {2000},
  isbn = {0-7695-0750-6},
  pages = {4728},
  doi = {http://doi.ieeecomputersociety.org/10.1109/ICPR.2000.903020},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
In visual surveillance applications, it is becoming common to analyze video images and interpret them using natural language concepts. In this paper, we propose a new approach to generating natural language descriptions of human behavior appearing in real video images. First, the head region of a person, taken as representative of the whole body, is extracted from each frame. Using a model-based method, the three-dimensional pose and position of the head are estimated. Next, the trajectory of these parameters is divided into segments of monotonic motion. For each segment, we evaluate conceptual features such as the degree of change in pose and position and the change in relative distance to objects in the surroundings. By calculating the product of these feature values, the most suitable verb is selected and the other syntactic elements are supplied. Finally, natural language text is generated using machine translation techniques.
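The segmentation and verb-selection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the function names (`segment_monotonic`, `select_verb`), the feature names, the verb list, and the `high`/`low` scoring functions are all hypothetical, and a one-dimensional trajectory stands in for the estimated pose and position parameters of the head.

```python
from math import prod


def segment_monotonic(values, eps=1e-6):
    """Split a 1-D trajectory into runs that keep rising, falling, or holding still.

    Returns (start, end, direction) tuples with inclusive indices;
    direction is +1 (rising), -1 (falling), or 0 (still).
    """
    def sign(delta):
        return 0 if abs(delta) <= eps else (1 if delta > 0 else -1)

    if len(values) < 2:
        return []
    segments = []
    start = 0
    current = sign(values[1] - values[0])
    for i in range(1, len(values) - 1):
        step = sign(values[i + 1] - values[i])
        if step != current:
            # Direction changed at sample i: close the run and start a new one.
            segments.append((start, i, current))
            start, current = i, step
    segments.append((start, len(values) - 1, current))
    return segments


def high(x):
    """Hypothetical score: close to 1 when the feature value is large."""
    return max(0.0, min(1.0, x))


def low(x):
    """Hypothetical score: close to 1 when the feature value is small."""
    return 1.0 - high(x)


# Assumed verb models: each verb maps feature names to a scoring function.
VERB_MODELS = {
    "walk": {"position_change": high, "pose_change": low},
    "turn": {"position_change": low, "pose_change": high},
    "stay": {"position_change": low, "pose_change": low},
}


def select_verb(features, verb_models):
    """Pick the verb whose per-feature scores have the largest product."""
    scores = {
        verb: prod(score(features[name]) for name, score in model.items())
        for verb, model in verb_models.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    print(segment_monotonic([0, 1, 2, 2, 2, 1, 0]))
    verb, scores = select_verb(
        {"position_change": 0.9, "pose_change": 0.1}, VERB_MODELS
    )
    print(verb, scores)
```

Taking the product rather than a sum means that a single feature scoring near zero rules a verb out, which matches the idea of a verb being suitable only when all of its conceptual features agree.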
Citation:
Atsuhiro Kojima, Masao Izumi, Takeshi Tamura, Kunio Fukunaga, "Generating Natural Language Description of Human Behavior from Video Images," Pattern Recognition, International Conference on, vol. 4, pp. 4728, 15th International Conference on Pattern Recognition (ICPR'00) - Volume 4, 2000.
