Issue No.04 - April (2011 vol.33)
pp: 721-740
Pedro F. Felzenszwalb, University of Chicago, Chicago
Ramin Zabih, Cornell University, Ithaca
ABSTRACT
Optimization is a powerful paradigm for expressing and solving problems in a wide range of areas, and has been successfully applied to many vision problems. Discrete optimization techniques are especially interesting since, by carefully exploiting problem structure, they often provide nontrivial guarantees concerning solution quality. In this paper, we review dynamic programming and graph algorithms, and discuss representative examples of how these discrete optimization techniques have been applied to some classical vision problems. We focus on the low-level vision problem of stereo, the mid-level problem of interactive object segmentation, and the high-level problem of model-based recognition.
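As a rough illustration of the kind of dynamic programming the abstract refers to, the sketch below labels a single scanline by minimizing a per-pixel matching cost plus a penalty on label changes between neighboring pixels. It is a minimal sketch under stated assumptions, not the authors' formulation: the function name `chain_dp`, the cost table `data_cost`, the parameter `smooth_weight`, and the absolute-difference penalty are all hypothetical choices made for this example.

```python
from typing import List, Sequence


def chain_dp(data_cost: Sequence[Sequence[float]], smooth_weight: float) -> List[int]:
    """Minimize sum_i D[i][x_i] + w * sum_i |x_i - x_(i+1)| over label sequences x.

    data_cost[i][b] is the (assumed) cost of giving pixel i the label b,
    for example a disparity hypothesis along one scanline.
    """
    n = len(data_cost)
    k = len(data_cost[0])

    # best[i][b]: cheapest cost of labeling pixels 0..i when pixel i takes label b.
    best = [list(data_cost[0])]
    # back[i-1][b]: label of pixel i-1 on the cheapest path ending at (i, b).
    back = []

    for i in range(1, n):
        row, ptr = [], []
        for b in range(k):
            # Pick the predecessor label that minimizes accumulated cost plus penalty.
            a = min(range(k), key=lambda a: best[i - 1][a] + smooth_weight * abs(a - b))
            row.append(best[i - 1][a] + smooth_weight * abs(a - b) + data_cost[i][b])
            ptr.append(a)
        best.append(row)
        back.append(ptr)

    # Backtrack from the cheapest final label to recover the full labeling.
    label = min(range(k), key=lambda b: best[-1][b])
    labels = [label]
    for ptr in reversed(back):
        label = ptr[label]
        labels.append(label)
    labels.reverse()
    return labels


if __name__ == "__main__":
    costs = [[0, 5], [5, 0], [0, 5]]
    print(chain_dp(costs, smooth_weight=1.0))   # weak smoothing follows the data
    print(chain_dp(costs, smooth_weight=10.0))  # strong smoothing keeps one label
```

With n pixels and k labels this runs in O(n k^2) time and returns a global minimum of the chain energy, which is the kind of guarantee on solution quality the abstract alludes to for problems with suitable structure.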
INDEX TERMS
Combinatorial algorithms, vision and scene understanding, artificial intelligence, computing methodologies.
CITATION
Pedro F. Felzenszwalb, Ramin Zabih, "Dynamic Programming and Graph Algorithms in Computer Vision", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 33, no. 4, pp. 721-740, April 2011, doi:10.1109/TPAMI.2010.135