Issue No. 05 - September/October 2009 (vol. 15), pp. 828-840
Guofeng Zhang , Zhejiang University, Hangzhou
Zilong Dong , Zhejiang University, Hangzhou
Jiaya Jia , The Chinese University of Hong Kong, Hong Kong
Liang Wan , The Chinese University of Hong Kong, Hong Kong
Tien-Tsin Wong , The Chinese University of Hong Kong, Hong Kong
Hujun Bao , Zhejiang University, Hangzhou
ABSTRACT
Compared to still-image editing, content-based video editing faces the additional challenge of maintaining spatiotemporal consistency with respect to scene geometry. This makes it difficult to seamlessly modify video content, for instance, to insert or remove an object. In this paper, we present a new video editing system for creating spatiotemporally consistent and visually appealing refilming effects. Unlike typical filming practice, our system requires no labor-intensive construction of 3D models or surfaces mimicking the real scene. Instead, it is based on unsupervised inference of view-dependent depth maps for all video frames. We provide interactive tools requiring only a small amount of user input to perform elementary video content editing, such as separating video layers, completing the background scene, and extracting moving objects. These tools can be combined to produce a variety of visual effects, including but not limited to video composition, the "predator" effect, bullet-time, depth-of-field, and fog synthesis. Some of the effects can be achieved in real time.
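Among the effects listed, fog synthesis is a simple illustration of why per-pixel depth matters: once a depth map is available, fog can be composited with the standard atmospheric-scattering model, attenuating scene radiance by depth and blending in an airlight color. The sketch below is illustrative only (the function name and parameters `beta`/`airlight` are our own, not from the paper), assuming a NumPy image and a matching per-pixel depth map.

```python
import numpy as np

def add_fog(image, depth, beta=0.5, airlight=0.9):
    """Composite synthetic fog onto an image using a per-pixel depth map.

    Standard atmospheric-scattering model (illustrative, not the paper's code):
        I' = I * exp(-beta * d) + A * (1 - exp(-beta * d))
    where d is scene depth, beta the fog density, and A the airlight color.
    """
    transmission = np.exp(-beta * depth)   # fraction of scene radiance surviving
    t = transmission[..., np.newaxis]      # broadcast over the color channels
    return image * t + airlight * (1.0 - t)

# Toy example: a 2x2 RGB black scene; farther pixels receive more fog.
img = np.zeros((2, 2, 3))
depth = np.array([[0.0, 1.0],
                  [2.0, 10.0]])           # near ... far
foggy = add_fog(img, depth)
```

Because the blend weight depends only on depth, fog density increases monotonically with distance: the nearest pixel is left untouched, while distant pixels converge to the airlight color.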
INDEX TERMS
Video editing, refilming, depth estimation, composition, background completion, layer separation.
CITATION
Guofeng Zhang, Zilong Dong, Jiaya Jia, Liang Wan, Tien-Tsin Wong, Hujun Bao, "Refilming with Depth-Inferred Videos," IEEE Transactions on Visualization & Computer Graphics, vol. 15, no. 5, pp. 828-840, September/October 2009, doi:10.1109/TVCG.2009.47