Issue No.07 - July (2003 vol.25)
pp: 828-836
Nebojsa Jojic, IEEE Computer Society
ABSTRACT
We present a new approach to modeling and processing multimedia data. This approach is based on graphical models that combine audio and video variables. We demonstrate it by developing a new algorithm for tracking a moving object in a cluttered, noisy scene using two microphones and a camera. Our model uses unobserved variables to describe the data in terms of the process that generates them. It is therefore able to capture and exploit the statistical structure of the audio and video data separately, as well as their mutual dependencies. Model parameters are learned from data via an EM algorithm, and automatic calibration is performed as part of this procedure. Tracking is done by Bayesian inference of the object location from data. We demonstrate successful performance on multimedia clips captured in real world scenarios using off-the-shelf equipment.
INDEX TERMS
Audio, video, audiovisual, graphical models, generative models, probabilistic inference, Bayesian inference, variational methods, expectation-maximization (EM) algorithm, multimodal, multimedia, tracking, speaker modeling, speech, vision, microphone arrays, cameras, automatic calibrations.
CITATION
Matthew J. Beal, Nebojsa Jojic, Hagai Attias, "A Graphical Model for Audiovisual Object Tracking", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 25, no. 7, pp. 828-836, July 2003, doi:10.1109/TPAMI.2003.1206512