18th International Conference on Pattern Recognition (ICPR'06) Volume 1
Audio-Visual Speaker Localization Using Graphical Models
Hong Kong
August 20-August 24
ISBN: 0-7695-2521-0
Akash Kushal, University of Illinois, Urbana Champaign
Mandar Rahurkar, University of Illinois, Urbana Champaign
Li Fei-Fei, University of Illinois, Urbana Champaign
Jean Ponce, University of Illinois, Urbana Champaign
Thomas Huang, University of Illinois, Urbana Champaign
In this work, we propose an approach that combines audio and video modalities for person tracking using graphical models. We demonstrate a principled and intuitive framework for combining these modalities to obtain robustness against occlusion and changes in appearance. We further exploit the temporal correlations that exist between adjacent frames for a moving object to handle cases where both modalities together might still not be enough, e.g., when the person being tracked is occluded and not speaking. Improvement in tracking results is shown at each step and compared with manually annotated ground truth.
Citation:
Akash Kushal, Mandar Rahurkar, Li Fei-Fei, Jean Ponce, Thomas Huang, "Audio-Visual Speaker Localization Using Graphical Models," 18th International Conference on Pattern Recognition (ICPR'06), vol. 1, pp. 291-294, 2006.