Distance Metric Between 3D Models and 2D Images for Recognition and Classification
April 1996 (vol. 18 no. 4)
pp. 465-470

Abstract: Similarity measurements between 3D objects and 2D images are useful for object recognition and classification. We distinguish between two types of similarity metrics: metrics computed in image space (image metrics) and metrics computed in transformation space (transformation metrics). Existing methods typically use image metrics, that is, metrics that measure the difference in the image between the observed image and the nearest view of the object. An example of such a measure is the Euclidean distance between feature points in the image and their corresponding points in the nearest view. (This measure can be computed by solving the exterior orientation calibration problem.) In this paper we introduce a different type of metric: transformation metrics. These metrics penalize the deformations applied to the object to produce the observed image.
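As a rough illustration of the image-metric idea described in the abstract, the sketch below projects a 3D model under weak perspective (scaled orthographic projection) for a given pose and sums the squared Euclidean distances to the observed feature points. The function names, the pose parameters, and the toy data are assumptions made for illustration. The sketch evaluates the distance for one candidate view only; finding the nearest view, which the abstract ties to the exterior orientation calibration problem, is not attempted here.

```python
import numpy as np

def weak_perspective_view(model, rotation, scale, translation):
    """Project n x 3 model points to n x 2 image points.

    Weak perspective: rotate, drop the depth coordinate, scale, translate.
    (Hypothetical helper, not taken from the paper.)
    """
    return scale * model @ rotation[:2].T + translation

def image_distance(view, image):
    """Sum of squared Euclidean distances between corresponding 2D points."""
    return float(np.sum((view - image) ** 2))

# Toy data (assumed): four non-coplanar model points and one candidate pose.
model = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
view = weak_perspective_view(model, np.eye(3), 2.0, np.array([1.0, 1.0]))

# Observed image: the same view with every point shifted by 0.1 along x.
observed = view + np.array([0.1, 0.0])
distance = image_distance(view, observed)
```

An image metric in the sense of the abstract would minimize this quantity over all admissible poses; the sketch only scores a single pose.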

In particular, we define a transformation metric that optimally penalizes "affine deformations" under weak perspective. A closed-form solution, together with the nearest view according to this metric, is derived. The metric is shown to be equivalent to the Euclidean image metric, in the sense that they bound each other from both above and below. It therefore provides an easy-to-use closed-form approximation for the commonly used least-squares distance between models and images. We demonstrate an image understanding application, where the true dimensions of a photographed battery charger are estimated by minimizing the transformation metric.
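The abstract does not state the closed-form expression, so the following is only a sketch of what penalizing an affine deformation can look like, under assumptions of our own. It fits a 2 x 3 affine map from model to image by least squares, then measures the squared Frobenius distance from that map to the nearest scaled matrix with orthonormal rows (a weak-perspective view). With singular values s1 and s2, that distance is (s1 - s2)^2 / 2. The names `fit_affine` and `affine_deformation_penalty` are hypothetical, and this penalty is a stand-in, not the metric derived in the paper.

```python
import numpy as np

def fit_affine(model, image):
    """Least-squares affine fit: image is approximated by model @ A.T + t.

    model is n x 3, image is n x 2; returns the 2 x 3 matrix A and 2-vector t.
    """
    model_mean = model.mean(axis=0)
    image_mean = image.mean(axis=0)
    solution, *_ = np.linalg.lstsq(model - model_mean,
                                   image - image_mean, rcond=None)
    A = solution.T
    t = image_mean - model_mean @ A.T
    return A, t

def affine_deformation_penalty(A):
    """Squared Frobenius distance from A to the nearest scaled matrix with
    orthonormal rows (an assumed stand-in for a transformation metric)."""
    s1, s2 = np.linalg.svd(A, compute_uv=False)
    return float((s1 - s2) ** 2 / 2.0)

# Toy model (assumed): four non-coplanar points.
model = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])

# A rigid weak-perspective view: rotation about z by 30 degrees, scale 1.5.
c, s = np.cos(np.pi / 6), np.sin(np.pi / 6)
rigid = 1.5 * np.array([[c, -s, 0.0], [s, c, 0.0]])
A_rigid, _ = fit_affine(model, model @ rigid.T + np.array([2.0, -1.0]))
rigid_penalty = affine_deformation_penalty(A_rigid)

# A stretched view: x is scaled by 2 and y by 1, so it is not a rigid view.
stretched = np.array([[2.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
A_stretched, _ = fit_affine(model, model @ stretched.T)
stretched_penalty = affine_deformation_penalty(A_stretched)
```

The rigid view yields a penalty of essentially zero, while the stretched view is penalized in proportion to the gap between its two singular values. The paper's bound between its metric and the Euclidean image metric is not reproduced by this sketch.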

Index Terms:
Affine deformations, 3D-to-2D metric, object recognition, exterior orientation calibration.
Ronen Basri, Daphna Weinshall, "Distance Metric Between 3D Models and 2D Images for Recognition and Classification," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 4, pp. 465-470, April 1996, doi:10.1109/34.491630