Issue No.06 - June (2009 vol.31)
pp: 974-988
Guofeng Zhang, Zhejiang University, Hangzhou
Jiaya Jia, The Chinese University of Hong Kong, Hong Kong
Tien-Tsin Wong, The Chinese University of Hong Kong, Hong Kong
Hujun Bao, Zhejiang University, Hangzhou
ABSTRACT
This paper presents a novel method for recovering consistent depth maps from a video sequence. We propose a bundle optimization framework to address the major difficulties in stereo reconstruction, such as dealing with image noise, occlusions, and outliers. Unlike typical multi-view stereo methods, our approach not only imposes the photo-consistency constraint but also explicitly associates the geometric coherence with multiple frames in a statistical way. It can thus naturally maintain the temporal coherence of the recovered dense depth maps without over-smoothing. To make the inference tractable, we introduce an iterative optimization scheme that first initializes the disparity maps using a segmentation prior and then refines the disparities by means of bundle optimization. Instead of defining the visibility parameters, our method implicitly models the reconstruction noise as well as the probabilistic visibility. After bundle optimization, we introduce an efficient space-time fusion algorithm to further reduce the reconstruction noise. Our automatic depth recovery is evaluated using a variety of challenging video examples.
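As a rough illustration of the photo-consistency constraint mentioned in the abstract, the following minimal sketch builds a per-pixel matching cost over candidate disparities for a rectified grayscale image pair and picks the lowest-cost disparity at each pixel. This is a generic textbook formulation, not the paper's bundle optimization: the function names `photo_consistency_cost` and `winner_take_all`, the absolute-difference cost, the rectified two-view setup, and the winner-take-all selection (swapped in for any global optimization) are all assumptions made for illustration.

```python
import numpy as np

def photo_consistency_cost(ref, other, disparities):
    """Build a (D, H, W) cost volume of absolute intensity differences.

    Assumes a rectified pair: pixel (y, x) in `ref` with disparity d
    corresponds to pixel (y, x - d) in `other`. Disparities must be
    non-negative integers smaller than the image width. Pixels whose
    match falls outside the image keep an infinite cost.
    """
    h, w = ref.shape
    cost = np.full((len(disparities), h, w), np.inf)
    for i, d in enumerate(disparities):
        # Compare ref columns d..w-1 against other columns 0..w-d-1.
        cost[i, :, d:] = np.abs(ref[:, d:] - other[:, :w - d])
    return cost

def winner_take_all(cost):
    """Pick, per pixel, the index of the disparity with the lowest cost."""
    return np.argmin(cost, axis=0)

# Synthetic check: `ref` is `other` shifted right by 3 pixels.
rng = np.random.default_rng(0)
other = rng.random((4, 20))
ref = np.roll(other, 3, axis=1)
disparity = winner_take_all(photo_consistency_cost(ref, other, range(6)))
```

A purely local selection like this is noisy on real footage and has no notion of coherence across frames, which is the kind of limitation the abstract's geometric coherence term and space-time fusion are described as addressing.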
INDEX TERMS
Stereo, Motion, Depth cues
CITATION
Guofeng Zhang, Jiaya Jia, Tien-Tsin Wong, Hujun Bao, "Consistent Depth Maps Recovery from a Video Sequence", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 31, no. 6, pp. 974-988, June 2009, doi:10.1109/TPAMI.2009.52