Fast Approximate Energy Minimization via Graph Cuts
November 2001 (vol. 23 no. 11)
pp. 1222-1239

Abstract: Many tasks in computer vision involve assigning a label (such as disparity) to every pixel. A common constraint is that the labels should vary smoothly almost everywhere while preserving sharp discontinuities that may exist, e.g., at object boundaries. These tasks are naturally stated in terms of energy minimization. In this paper, we consider a wide class of energies with various smoothness constraints. Global minimization of these energy functions is NP-hard even in the simplest discontinuity-preserving case. Therefore, our focus is on efficient approximation algorithms. We present two algorithms based on graph cuts that efficiently find a local minimum with respect to two types of large moves, namely expansion moves and swap moves. These moves can simultaneously change the labels of arbitrarily large sets of pixels. In contrast, many standard algorithms (including simulated annealing) use small moves where only one pixel changes its label at a time. Our expansion algorithm finds a labeling within a known factor of the global minimum, while our swap algorithm handles more general energy functions. Both algorithms handle important cases of discontinuity-preserving energies. We experimentally demonstrate the effectiveness of our approach for image restoration, stereo, and motion. On real data with ground truth, we achieve 98 percent accuracy.
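To make the notion of an expansion move concrete, here is a minimal Python sketch of local search over expansion moves on a tiny 1D "image". It is an illustration under stated assumptions, not the paper's implementation: the absolute-difference data cost, the Potts-style penalty `lam`, and all function names are hypothetical, and the best move is found by brute-force enumeration rather than by the graph cut the paper uses, so it only scales to a handful of pixels.

```python
from itertools import product


def energy(labels, observed, lam=2):
    """Data cost plus a Potts-style penalty of lam per neighbouring pair that disagrees."""
    data = sum(abs(l - o) for l, o in zip(labels, observed))
    smooth = sum(lam for a, b in zip(labels, labels[1:]) if a != b)
    return data + smooth


def best_expansion_move(labels, alpha, observed, lam=2):
    """Return the lowest-energy labeling within one alpha-expansion of `labels`.

    In an alpha-expansion every pixel either keeps its label or switches to alpha.
    Assumption: the 2^n candidate moves are enumerated by brute force here,
    standing in for the minimum cut that makes this step efficient.
    """
    best, best_e = list(labels), energy(labels, observed, lam)
    for switch in product([False, True], repeat=len(labels)):
        cand = [alpha if s else l for s, l in zip(switch, labels)]
        cand_e = energy(cand, observed, lam)
        if cand_e < best_e:
            best, best_e = cand, cand_e
    return best, best_e


def expansion_local_search(observed, num_labels, lam=2):
    """Cycle over labels until no expansion move lowers the energy."""
    labels = [0] * len(observed)
    e = energy(labels, observed, lam)
    improved = True
    while improved:
        improved = False
        for alpha in range(num_labels):
            cand, cand_e = best_expansion_move(labels, alpha, observed, lam)
            if cand_e < e:
                labels, e = cand, cand_e
                improved = True
    return labels, e


if __name__ == "__main__":
    noisy = [0, 0, 2, 0, 2, 2, 1, 2]
    restored, final_energy = expansion_local_search(noisy, num_labels=3)
    print(restored, final_energy)
```

A single move here can relabel any subset of pixels at once, which is the contrast the abstract draws with one-pixel-at-a-time methods. A swap move would be enumerated analogously, with pixels currently labeled alpha or beta allowed to exchange those two labels.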

Index Terms:
Energy minimization, early vision, graph algorithms, minimum cut, maximum flow, stereo, motion, image restoration, Markov Random Fields, Potts model, multiway cut.
Citation:
Yuri Boykov, Olga Veksler, Ramin Zabih, "Fast Approximate Energy Minimization via Graph Cuts," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 11, pp. 1222-1239, Nov. 2001, doi:10.1109/34.969114