Issue No. 5 - May 2011 (vol. 33), pp. 978-994
Ce Liu , Microsoft Research New England, Cambridge
Jenny Yuen , Massachusetts Institute of Technology, Cambridge
Antonio Torralba , Massachusetts Institute of Technology, Cambridge
ABSTRACT
While image alignment has been studied in different areas of computer vision for decades, aligning images depicting different scenes remains a challenging problem. Analogous to optical flow, where an image is aligned to its temporally adjacent frame, we propose SIFT flow, a method to align an image to its nearest neighbors in a large image corpus containing a variety of scenes. The SIFT flow algorithm consists of matching densely sampled, pixelwise SIFT features between two images while preserving spatial discontinuities. The SIFT features allow robust matching across different scene/object appearances, whereas the discontinuity-preserving spatial model allows matching of objects located at different parts of the scene. Experiments show that the proposed approach robustly aligns complex scene pairs containing significant spatial differences. Based on SIFT flow, we propose an alignment-based large database framework for image analysis and synthesis, where image information is transferred from the nearest neighbors to a query image according to the dense scene correspondence. This framework is demonstrated through concrete applications such as motion field prediction from a single image, motion synthesis via object transfer, satellite image registration, and face recognition.
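The objective the abstract describes combines a data term matching dense, pixelwise SIFT descriptors with a small-displacement prior and a truncated (discontinuity-preserving) smoothness term. The snippet below is a minimal sketch of evaluating such an energy for a given integer flow field; the parameter names (t, eta, alpha, d) and the toy descriptor arrays are illustrative, not the authors' implementation, which optimizes this objective with coarse-to-fine belief propagation.

```python
import numpy as np

def sift_flow_energy(d1, d2, flow, t=50.0, eta=0.001, alpha=2.0, d=40.0):
    """Evaluate a SIFT-flow-style energy for a given flow field.

    d1, d2: H x W x K dense per-pixel descriptor arrays.
    flow:   H x W x 2 integer displacement field (u, v) per pixel.
    """
    H, W, _ = d1.shape
    data = small = smooth = 0.0
    for y in range(H):
        for x in range(W):
            u, v = flow[y, x]
            # Clamp the matched location to the image (toy border handling).
            x2 = min(max(x + u, 0), W - 1)
            y2 = min(max(y + v, 0), H - 1)
            # Data term: truncated L1 distance between matched descriptors.
            data += min(np.abs(d1[y, x] - d2[y2, x2]).sum(), t)
            # Small-displacement prior on the flow magnitude.
            small += eta * (abs(u) + abs(v))
            # Truncated L1 smoothness over 4-connected neighbors
            # (right and down), which preserves spatial discontinuities.
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < H and nx < W:
                    u2, v2 = flow[ny, nx]
                    smooth += min(alpha * abs(u - u2), d)
                    smooth += min(alpha * abs(v - v2), d)
    return data + small + smooth
```

Truncating both the data and smoothness terms is what lets the model tolerate outliers and flow discontinuities: beyond the thresholds t and d, a bad match or a large flow jump incurs only a bounded penalty.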
INDEX TERMS
Scene alignment, dense scene correspondence, SIFT flow, coarse-to-fine, belief propagation, alignment-based large database framework, satellite image registration, face recognition, motion prediction for a single image, motion synthesis via object transfer.
CITATION
Ce Liu, Jenny Yuen, Antonio Torralba, "SIFT Flow: Dense Correspondence across Scenes and Its Applications," IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 33, no. 5, pp. 978-994, May 2011, doi:10.1109/TPAMI.2010.147