Issue No.11 - Nov. (2013 vol.35)
pp: 2751-2764
V. Badrinarayanan, Dept. of Eng., Univ. of Cambridge, Cambridge, UK
I. Budvytis, Dept. of Eng., Univ. of Cambridge, Cambridge, UK
R. Cipolla, Dept. of Eng., Univ. of Cambridge, Cambridge, UK
ABSTRACT
We present a novel patch-based probabilistic graphical model for semi-supervised video segmentation. At the heart of our model is a temporal tree structure that links patches in adjacent frames through the video sequence. This permits exact inference of pixel labels without resorting to traditional short time window-based video processing or instantaneous decision making. The input to our algorithm is labeled key frame(s) of a video sequence and the output is pixel-wise labels along with their confidences. We propose an efficient inference scheme that performs exact inference over the temporal tree, and optionally a per frame label smoothing step using loopy BP, to estimate pixel-wise labels and their posteriors. These posteriors are used to learn pixel unaries by training a Random Decision Forest in a semi-supervised manner. These unaries are used in a second iteration of label inference to improve the segmentation quality. We demonstrate the efficacy of our proposed algorithm using several qualitative and quantitative tests on both foreground/background and multiclass video segmentation problems using publicly available and our own datasets.
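The exact inference that a tree structure permits can be illustrated with a minimal sum-product sketch. This is not the authors' patch-based model or their implementation: it is generic two-pass belief propagation on a small rooted tree, and the function name `tree_marginals`, its arguments, and the single shared `pairwise` table are assumptions made for illustration. It also assumes every `pairwise` entry is strictly positive and every node has at least one non-zero unary score.

```python
def tree_marginals(parent, unary, pairwise):
    """Exact per-node label marginals on a rooted tree via sum-product.

    Illustrative sketch only (hypothetical names, not the paper's code).
    parent[i] : index of node i's parent, -1 for the root (node 0);
                parents must come before children (parent[i] < i).
    unary[i]  : K non-negative label scores for node i.
    pairwise  : K x K table of strictly positive scores;
                pairwise[a][b] couples parent label a with child label b.
    """
    n, k = len(parent), len(unary[0])
    labels = range(k)

    # Upward pass (leaves to root): belief[i] collects node i's unary
    # scores times the messages received from all of its children.
    belief = [list(u) for u in unary]
    up = [None] * n
    for i in range(n - 1, 0, -1):
        up[i] = [sum(pairwise[a][b] * belief[i][b] for b in labels)
                 for a in labels]
        p = parent[i]
        belief[p] = [belief[p][a] * up[i][a] for a in labels]

    # Downward pass (root to leaves): each child receives the parent's
    # full belief with the child's own upward message divided out.
    down = [[1.0] * k for _ in range(n)]
    marginals = []
    for i in range(n):
        if i > 0:
            p = parent[i]
            rest = [belief[p][a] * down[p][a] / up[i][a] for a in labels]
            down[i] = [sum(pairwise[a][b] * rest[a] for a in labels)
                       for b in labels]
        score = [belief[i][b] * down[i][b] for b in labels]
        total = sum(score)
        marginals.append([s / total for s in score])
    return marginals


# A labeled root (hard label 0) linked to one unlabeled child.
posteriors = tree_marginals(
    parent=[-1, 0],
    unary=[[1.0, 0.0], [0.5, 0.5]],
    pairwise=[[0.9, 0.1], [0.1, 0.9]],
)
```

Because the graph is a tree, the two passes give exact marginals in time linear in the number of nodes. In this toy case the child's posterior is simply the row of `pairwise` selected by the root's label.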
INDEX TERMS
Image segmentation, Vegetation, Graphical models, Computational modeling, Video sequences, Probabilistic logic, Inference algorithms, structured variational inference, Semi-supervised video segmentation, label propagation, mixture of trees graphical model, tree-structured video models
CITATION
V. Badrinarayanan, I. Budvytis, R. Cipolla, "Semi-Supervised Video Segmentation Using Tree Structured Graphical Models", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 35, no. 11, pp. 2751-2764, Nov. 2013, doi:10.1109/TPAMI.2013.54