Issue No.05 - May (2013 vol.35)
pp: 1066-1079
N. Payet, Sch. of Electr. Eng. & Comput. Sci., Oregon State Univ., Corvallis, OR, USA
S. Todorovic, Sch. of Electr. Eng. & Comput. Sci., Oregon State Univ., Corvallis, OR, USA
ABSTRACT
This paper presents a new computational framework for detecting and segmenting object occurrences in images. We combine a Hough forest (HF) and a conditional random field (CRF) into HFRF to assign labels of object classes to image regions. HF captures intrinsic and contextual properties of objects. CRF then fuses the labeling hypotheses generated by HF to identify every object occurrence. Interaction between HF and CRF happens in HFRF inference, which uses the Metropolis-Hastings algorithm. The Metropolis-Hastings reversible jumps depend on two ratios of proposal and posterior distributions. Instead of estimating four distributions, we directly compute the two ratios using HF. In leaf nodes, HF records class histograms of training examples and information about their configurations. This evidence is used in inference for nonparametric estimation of the two distribution ratios. Our empirical evaluation on benchmark datasets demonstrates higher average precision rates of object detection, smaller object segmentation error, and faster convergence rates of our inference, relative to the state of the art. The paper also presents theoretical error bounds of HF and HFRF applied to two-class object detection and segmentation.
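The role of the two ratios in a Metropolis-Hastings step can be illustrated with a minimal Python sketch. It is not the paper's HFRF inference: the function names (`mh_accept_prob`, `toy_ratios`, `run_chain`), the single-flip proposal, and the 9:1 label odds are all assumptions made for illustration, and `toy_ratios` is a hypothetical stand-in for the ratio estimates that the abstract says come from the Hough forest. The only point shown is that the acceptance probability needs the proposal ratio and the posterior ratio, not the four underlying distributions.

```python
import random


def mh_accept_prob(proposal_ratio, posterior_ratio):
    """Metropolis-Hastings acceptance probability from two precomputed ratios.

    proposal_ratio  = q(current | proposed) / q(proposed | current)
    posterior_ratio = p(proposed | image) / p(current | image)
    """
    return min(1.0, proposal_ratio * posterior_ratio)


def toy_ratios(current, proposed, odds=9.0):
    """Hypothetical stand-in for the ratio estimator (not a Hough forest).

    Each region independently prefers label 1 with the given odds, and the
    single-flip proposal is symmetric, so the proposal ratio is 1.
    """
    posterior_ratio = 1.0
    for old, new in zip(current, proposed):
        if old != new:
            posterior_ratio *= odds if new == 1 else 1.0 / odds
    return 1.0, posterior_ratio


def run_chain(n_regions, n_steps, ratio_fn=toy_ratios, seed=0):
    """Run a single-flip Metropolis-Hastings chain over binary region labels."""
    rng = random.Random(seed)
    labels = [0] * n_regions  # start with every region labeled background
    for _ in range(n_steps):
        # Propose flipping the label of one randomly chosen region.
        proposed = list(labels)
        i = rng.randrange(n_regions)
        proposed[i] = 1 - proposed[i]
        # Only the two ratios are needed to decide acceptance.
        proposal_ratio, posterior_ratio = ratio_fn(labels, proposed)
        if rng.random() < mh_accept_prob(proposal_ratio, posterior_ratio):
            labels = proposed
    return labels
```

Calling `run_chain(100, 10000)` returns a list of 100 binary labels; under the toy 9:1 odds assumed here, the fraction of regions labeled 1 should settle near 0.9 once the chain has mixed.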
INDEX TERMS
Training, Hafnium, Image segmentation, Object recognition, Vegetation, Proposals, Image edge detection, Metropolis-Hastings algorithm, Object recognition and segmentation, conditional random field, Hough forest
CITATION
N. Payet, S. Todorovic, "Hough Forest Random Field for Object Recognition and Segmentation", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 35, no. 5, pp. 1066-1079, May 2013, doi:10.1109/TPAMI.2012.194