2006 IEEE International Conference on Multimedia and Expo
Conversation Scene Analysis with Dynamic Bayesian Network Based on Visual Head Tracking
Toronto, ON, Canada
July 09-July 12
ISBN: 1-4244-0366-7
Kazuhiro Otsuka, NTT Communication Science Labs, otsuka@eye.brl.ntt.co.jp
Junji Yamato, NTT Communication Science Labs, yamato@eye.brl.ntt.co.jp
Yoshinao Takemae, NTT Cyber Solutions Labs, takemae.yoshinao@lab.ntt.co.jp
Hiroshi Murase, Nagoya University, murase@is.nagoya-u.ac.jp
A novel method based on a probabilistic model for conversation scene analysis is proposed that infers conversation structure from video sequences of face-to-face communication. Conversation structure represents the type of conversation, such as monologue or dialogue, and can indicate who is talking to whom and who is listening to whom. This study assumes that participants' gaze directions provide cues for discerning the conversation structure, and that gaze directions can be identified from head directions. To measure head directions, the proposed method employs a visual head tracker based on Sparse-Template Condensation. The conversation model is built on a dynamic Bayesian network and is used to estimate the conversation structure and gaze directions from observed head directions and utterances. Visual tracking is conventionally thought to be less reliable than contact sensors, but experiments confirm that the proposed method achieves performance nearly comparable to that of a conventional sensor-based method in estimating gaze directions and conversation structure.
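The idea of estimating a hidden conversation structure from observed head directions and utterances can be illustrated with a deliberately simplified sketch. The code below is not the authors' model: it assumes three participants, treats the only possible structures as "participant r is giving a monologue", reduces head direction to a discrete "pointing toward participant j" label, and uses made-up probabilities. All names (`forward_filter`, `likelihood`, the `P_*` constants) are hypothetical. It runs standard forward filtering on a hidden Markov model, which is the simplest form of a dynamic Bayesian network.

```python
# Toy forward filter over a hidden "conversation regime".
# Illustrative only: the state space and every probability below are
# assumptions made for this sketch, not values or structure from the paper.

N = 3  # participants; regime r means "participant r is giving a monologue"

P_STAY = 0.9               # assumed chance the regime persists to the next frame
P_LOOK_AT_SPEAKER = 0.8    # assumed chance a listener's head points at the speaker
P_SPEAK_IF_SPEAKER = 0.9   # assumed chance the regime's speaker is heard talking
P_SPEAK_IF_LISTENER = 0.1  # assumed chance a listener is heard talking


def transition(prev, cur):
    """P(regime_t = cur | regime_{t-1} = prev): sticky, otherwise uniform."""
    return P_STAY if prev == cur else (1.0 - P_STAY) / (N - 1)


def likelihood(regime, head_targets, speaking):
    """P(observation | regime) for one frame.

    head_targets[i] is the participant that i's head points toward (never i);
    speaking[i] is True when an utterance by i is detected.
    """
    p = 1.0
    for i in range(N):
        if i == regime:
            # The speaker's head direction is treated as uninformative.
            p *= 1.0 / (N - 1)
            p *= P_SPEAK_IF_SPEAKER if speaking[i] else 1.0 - P_SPEAK_IF_SPEAKER
        else:
            if head_targets[i] == regime:
                p *= P_LOOK_AT_SPEAKER
            else:
                p *= (1.0 - P_LOOK_AT_SPEAKER) / (N - 2)
            p *= P_SPEAK_IF_LISTENER if speaking[i] else 1.0 - P_SPEAK_IF_LISTENER
    return p


def forward_filter(frames):
    """Return the filtered posterior over regimes after each frame."""
    belief = [1.0 / N] * N  # uniform prior over regimes
    history = []
    for head_targets, speaking in frames:
        # Predict: push the previous belief through the transition model.
        predicted = [
            sum(belief[prev] * transition(prev, cur) for prev in range(N))
            for cur in range(N)
        ]
        # Update: weight by the observation likelihood, then normalize.
        unnorm = [
            predicted[r] * likelihood(r, head_targets, speaking) for r in range(N)
        ]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history


# Two frames in which participants 1 and 2 face participant 0, who speaks,
# followed by two frames in which participants 0 and 2 face participant 1.
frames = [
    ((1, 0, 0), (True, False, False)),
    ((2, 0, 0), (True, False, False)),
    ((1, 0, 1), (False, True, False)),
    ((1, 2, 1), (False, True, False)),
]

for t, belief in enumerate(forward_filter(frames)):
    print(t, [round(b, 3) for b in belief])
```

In this sketch the posterior mass moves from regime 0 to regime 1 once the head directions and utterances change. A model of the kind the abstract describes would also need to treat gaze as a hidden variable inferred from noisy, continuous head directions, which this toy example omits.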
Citation:
Kazuhiro Otsuka, Junji Yamato, Yoshinao Takemae, Hiroshi Murase, "Conversation Scene Analysis with Dynamic Bayesian Network Based on Visual Head Tracking," 2006 IEEE International Conference on Multimedia and Expo (ICME), pp. 949-952, 2006