Image Representations and Feature Selection for Multimedia Database Search
July/August 2003 (vol. 15 no. 4)
pp. 911-920

Abstract: The success of a multimedia information system depends heavily on how its data are represented. Although there are “natural” ways to represent numerical data, it is not clear what constitutes a good representation for multimedia data such as images, video, or sound. In this paper, we investigate various image representations, judging the quality of each representation by how well a system for searching an image database performs with it; the same techniques and representations can also be used for other object detection tasks or multimedia data analysis problems. The system is based on a machine learning method that develops object detection models from example images; these models can subsequently be used to detect (that is, search for) images of a particular object in an image database. As the base classifier for the detection task, we use support vector machines (SVMs), a kernel-based learning method. Within the framework of kernel classifiers, we investigate new image representations (kernels) derived from probabilistic models of the class of images considered, and we present a new feature selection method that can reduce the dimensionality of the image representation without significant loss in the performance of the detection (search) system.
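To make the detect-by-learning idea in the abstract concrete, the following is a minimal sketch and not the authors' system. It assumes each image has already been reduced to a short feature vector, trains a linear SVM by stochastic subgradient steps on the regularised hinge loss, and then "searches" a toy database by keeping the items the classifier scores as positive. The toy vectors, the image names, and the functions `train_linear_svm` and `search` are all hypothetical; the probabilistic kernels and the feature selection method studied in the paper are not reproduced here.

```python
# Illustrative sketch only: a linear SVM (no kernel) on made-up feature vectors.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(examples, labels, lam=0.01, epochs=200, lr=0.1):
    """Stochastic subgradient descent on lam/2 * ||w||^2 + hinge loss."""
    w = [0.0] * len(examples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            margin = y * (dot(w, x) + b)
            # The regulariser shrinks w slightly on every step.
            w = [wi * (1.0 - lr * lam) for wi in w]
            # The hinge loss contributes only when the margin is violated.
            if margin < 1.0:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def search(database, w, b):
    """Return names of database items scored as positive, best first."""
    scored = [(dot(w, x) + b, name) for name, x in database.items()]
    return [name for score, name in sorted(scored, reverse=True) if score > 0]

# Hypothetical training set: +1 = image contains the object, -1 = it does not.
train_x = [[0.9, 0.8], [0.8, 0.9], [1.0, 0.7],
           [0.1, 0.2], [0.2, 0.1], [0.0, 0.3]]
train_y = [1, 1, 1, -1, -1, -1]

# Hypothetical database of already-extracted feature vectors.
database = {"img_a": [1.0, 1.0], "img_b": [0.0, 0.0], "img_c": [0.9, 0.9]}

w, b = train_linear_svm(train_x, train_y)
hits = search(database, w, b)
print(hits)
```

In this toy setting the quality of a representation would show up as how cleanly the two classes separate in feature space, which is the criterion the abstract uses to compare representations.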

Index Terms:
Machine learning, object detection, support vector machines, image representation, multimedia data search.
Theodoros Evgeniou, Massimiliano Pontil, Constantine Papageorgiou, Tomaso Poggio, "Image Representations and Feature Selection for Multimedia Database Search," IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 4, pp. 911-920, July-Aug. 2003, doi:10.1109/TKDE.2003.1209008