Issue No. 10 - October (2009 vol. 31)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPAMI.2009.43
Yang Wang, Simon Fraser University, Burnaby
Greg Mori, Simon Fraser University, Burnaby
We propose two new models for human action recognition from video sequences using topic models. Video sequences are represented by a novel “bag-of-words” representation, where each frame corresponds to a “word.” Our models differ from previous latent topic models for visual recognition in two major aspects: first, the latent topics in our models directly correspond to class labels; second, some of the latent variables in previous topic models become observed in our case. Our models have several advantages over other latent topic models used in visual recognition. First, training is much easier due to the decoupling of the model parameters. Second, the models alleviate the issue of how to choose the appropriate number of latent topics. Third, they achieve much better performance by utilizing the information provided by the class labels in the training set. We present action classification results on five different data sets. Our results are either comparable to, or significantly better than, previously published results on these data sets.
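The frame-as-word representation described above can be illustrated with a minimal sketch. The abstract does not say how a frame is mapped to a word, so everything here is an assumption made for illustration: each frame is taken to be a fixed-length descriptor, a small codebook of representative descriptors is given, and a frame is assigned to its nearest codeword by squared Euclidean distance. The function names `nearest_codeword` and `video_to_bag_of_words` are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import List, Sequence


def nearest_codeword(descriptor: Sequence[float],
                     codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the codeword closest to the frame descriptor.

    Assumption: squared Euclidean distance; the paper may use another measure.
    """
    best_index, best_dist = 0, float("inf")
    for index, codeword in enumerate(codebook):
        dist = sum((d - c) ** 2 for d, c in zip(descriptor, codeword))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def video_to_bag_of_words(frames: Sequence[Sequence[float]],
                          codebook: Sequence[Sequence[float]]) -> List[int]:
    """Turn a video (one descriptor per frame) into a word-count histogram.

    Each frame contributes exactly one "word"; frame order is discarded.
    """
    words = [nearest_codeword(frame, codebook) for frame in frames]
    counts = Counter(words)
    return [counts.get(k, 0) for k in range(len(codebook))]


# Toy data (hypothetical): three codewords, four frames.
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
frames = [[0.1, 0.0], [0.9, 1.2], [1.1, 0.8], [4.0, 6.0]]
histogram = video_to_bag_of_words(frames, codebook)
print(histogram)
```

In this toy case the histogram has one entry per codeword and its counts sum to the number of frames, which is the property that lets a video be treated like a document of words in a topic model.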
Human action recognition, video analysis, bag-of-words, probabilistic graphical models, event and activity understanding
Y. Wang and G. Mori, "Human Action Recognition by Semilatent Topic Models," in IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 31, no. 10, pp. 1762-1774, 2009.