Issue No.12 - Dec. (2013 vol.35)
pp: 2841-2853
David J. Crandall, Sch. of Inf. & Comput., Indiana Univ., Bloomington, IN, USA
Andrew Owens, Comput. Sci. & Artificial Intell. Lab., Massachusetts Inst. of Technol., Cambridge, MA, USA
Noah Snavely, Dept. of Comput. Sci., Cornell Univ., Ithaca, NY, USA
Daniel P. Huttenlocher, Dept. of Comput. Sci., Cornell Univ., Ithaca, NY, USA
ABSTRACT
Recent work in structure from motion (SfM) has built 3D models from large collections of images downloaded from the Internet. Many approaches to this problem use incremental algorithms that solve progressively larger bundle adjustment problems. These incremental techniques scale poorly as the image collection grows, and can suffer from drift or local minima. We present an alternative framework for SfM based on finding a coarse initial solution using hybrid discrete-continuous optimization and then improving that solution using bundle adjustment. The initial optimization step uses a discrete Markov random field (MRF) formulation, coupled with a continuous Levenberg-Marquardt refinement. The formulation naturally incorporates various sources of information about both the cameras and points, including noisy geotags and vanishing point (VP) estimates. We test our method on several large-scale photo collections, including one with measured camera positions, and show that it produces models that are similar to or better than those produced by incremental bundle adjustment, but more robustly and in a fraction of the time.
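The two-stage idea in the abstract (a coarse discrete MRF solution followed by continuous refinement) can be illustrated on a toy problem. The sketch below is not the authors' implementation: it estimates 1D camera headings from noisy relative headings, uses loopy min-sum belief propagation over quantized angles with a truncated (robust) pairwise cost, and then refines with plain gradient descent as a simple stand-in for Levenberg-Marquardt. All data values and names (`TRUE`, `EDGES`, `PRIOR`, `discrete_stage`, `refine`) are hypothetical and chosen only for illustration.

```python
import math

def wrap(a):
    """Wrap an angle to (-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))

# Toy data (hypothetical): true headings and noisy relative headings (i, j, theta_j - theta_i).
TRUE = [0.0, 0.7, 1.9, 3.0]
EDGES = [(0, 1, 0.72), (1, 2, 1.17), (2, 3, 1.11), (3, 0, -2.98), (0, 2, 1.89)]
PRIOR = {0: 0.0}   # absolute hint on camera 0 (plays the role of a geotag/VP-style cue)
K = 36             # number of discrete heading labels (10-degree bins)
TAU = 1.0          # truncation threshold of the robust pairwise cost

def discrete_stage(n, edges, prior, iters=20):
    """Coarse solution: loopy min-sum belief propagation over K heading labels."""
    angles = [2 * math.pi * k / K for k in range(K)]
    unary = [[abs(wrap(a - prior[i])) if i in prior else 0.0 for a in angles]
             for i in range(n)]
    # pair[(i, j)][xi][xj]: truncated cost of assigning labels xi, xj to cameras i, j.
    pair = {}
    for i, j, m in edges:
        pair[(i, j)] = [[min(abs(wrap(aj - ai - m)), TAU) for aj in angles]
                        for ai in angles]
        pair[(j, i)] = [[pair[(i, j)][xi][xj] for xi in range(K)]
                        for xj in range(K)]
    msg = {e: [0.0] * K for e in pair}
    nbrs = {i: [a for (a, b) in pair if b == i] for i in range(n)}
    for _ in range(iters):
        new = {}
        for (i, j) in pair:
            # Cost at camera i from its prior and all neighbors except j.
            h = [unary[i][x] + sum(msg[(k, i)][x] for k in nbrs[i] if k != j)
                 for x in range(K)]
            out = [min(h[xi] + pair[(i, j)][xi][xj] for xi in range(K))
                   for xj in range(K)]
            lo = min(out)
            new[(i, j)] = [v - lo for v in out]  # normalize to keep values bounded
        msg = new
    beliefs = [[unary[i][x] + sum(msg[(k, i)][x] for k in nbrs[i])
                for x in range(K)] for i in range(n)]
    return [angles[b.index(min(b))] for b in beliefs]

def refine(theta, edges, prior, step=0.05, iters=2000):
    """Continuous refinement: gradient descent on squared wrapped residuals."""
    theta = list(theta)
    for _ in range(iters):
        g = [0.0] * len(theta)
        for i, j, m in edges:
            r = wrap(theta[j] - theta[i] - m)
            g[j] += 2 * r
            g[i] -= 2 * r
        for i, p in prior.items():
            g[i] += 2 * wrap(theta[i] - p)
        theta = [t - step * gi for t, gi in zip(theta, g)]
    return theta

coarse = discrete_stage(len(TRUE), EDGES, PRIOR)
refined = refine(coarse, EDGES, PRIOR)
errors = [abs(wrap(r - t)) for r, t in zip(refined, TRUE)]
print("coarse: ", [round(c, 3) for c in coarse])
print("refined:", [round(r, 3) for r in refined])
```

The discrete stage only needs to land each camera in the right neighborhood; the continuous stage then removes the quantization error. In a real SfM pipeline the unknowns are 3D rotations, camera positions, and points rather than scalar headings, and the final step would be full bundle adjustment.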
INDEX TERMS
Cameras, Optimization, Robustness, Image reconstruction, Noise measurement, Belief propagation, Motion analysis, Structure from motion, 3D reconstruction, Markov random fields
CITATION
David J. Crandall, Andrew Owens, Noah Snavely, Daniel P. Huttenlocher, "SfM with MRFs: Discrete-Continuous Optimization for Large-Scale Structure from Motion", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 35, no. 12, pp. 2841-2853, Dec. 2013, doi:10.1109/TPAMI.2012.218