Algebraic Functions For Recognition
August 1995 (vol. 17 no. 8)
pp. 779-789

Abstract—In the general case, a trilinear relationship between three perspective views is shown to exist. The trilinearity result is shown to be of much practical use in visual recognition by alignment—yielding a direct reprojection method that cuts through the computations of camera transformation, scene structure and epipolar geometry. Moreover, the direct method is linear and sets a new lower theoretical bound on the minimal number of points that are required for a linear solution for the task of reprojection. The proof of the central result may be of further interest as it demonstrates certain regularities across homographies of the plane and introduces new view invariants. Experiments on simulated and real image data were conducted, including a comparative analysis with epipolar intersection and the linear combination methods, with results indicating a greater degree of robustness in practice and a higher level of performance in reprojection tasks.
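The reprojection idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's estimation procedure: here the 27 trilinear coefficients are built from known synthetic camera matrices (the standard trifocal tensor construction), purely to check that a point in the third view is determined by its matches in the other two. All names (`A`, `a4`, `B`, `b4`, `trilinear_coefficients`, `reproject`) and all numeric values are assumptions made for illustration.

```python
# Hypothetical synthetic cameras: P1 = [I | 0], P2 = [A | a4], P3 = [B | b4].
A = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
a4 = [1.0, 0.5, 0.2]
B = [[0.9, 0.0, 0.3], [0.1, 1.0, 0.0], [0.0, -0.2, 1.0]]
b4 = [-0.5, 1.0, 0.3]


def matvec(M, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(M[r][c] * v[c] for c in range(3)) for r in range(3)]


def project(M, t, X):
    """Project the homogeneous 3D point X = (x, y, z, w) with camera [M | t]."""
    Mx = matvec(M, X[:3])
    return [Mx[r] + X[3] * t[r] for r in range(3)]


def trilinear_coefficients(A, a4, B, b4):
    """Return T[i][j][k] = A[j][i] * b4[k] - a4[j] * B[k][i] (27 values)."""
    return [[[A[j][i] * b4[k] - a4[j] * B[k][i] for k in range(3)]
             for j in range(3)] for i in range(3)]


def reproject(T, x1, x2):
    """Predict the third-view point from matching points in views 1 and 2."""
    # Any line through x2 that misses the epipole a4 works; this one is the
    # vertical image line through x2.
    line = [x2[2], 0.0, -x2[0]]
    return [sum(x1[i] * line[j] * T[i][j][k]
                for i in range(3) for j in range(3))
            for k in range(3)]


def dehomogenize(p):
    """Convert homogeneous image coordinates to (x, y)."""
    return (p[0] / p[2], p[1] / p[2])


# A synthetic scene point and its three projections.
X = [0.3, -0.2, 2.0, 1.0]
x1 = X[:3]                 # view 1 uses P1 = [I | 0]
x2 = project(A, a4, X)
x3 = project(B, b4, X)

T = trilinear_coefficients(A, a4, B, b4)
predicted = dehomogenize(reproject(T, x1, x2))
actual = dehomogenize(x3)
print(predicted, actual)   # the two pairs should agree up to rounding
```

The prediction uses only the two image points and the coefficients, with no explicit scene structure or epipolar lines. The chosen line must not pass through the epipole `a4` in the second view; if it does, the sum vanishes and a different line through `x2` should be used.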

Index Terms:
Visual recognition, alignment, reprojection, projective geometry, algebraic and geometric invariants.
Citation:
Amnon Shashua, "Algebraic Functions For Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, no. 8, pp. 779-789, Aug. 1995, doi:10.1109/34.400567