A Discriminative Learning Framework with Pairwise Constraints for Video Object Classification
April 2006 (vol. 28 no. 4)
pp. 578-593
To address the shortage of labeled data in video object classification, one solution is to use additional pairwise constraints that indicate the relationship between two examples, i.e., whether they belong to the same class. In this paper, we propose a discriminative learning approach that incorporates pairwise constraints into a conventional margin-based learning framework. Unlike previous work, which usually attempts to learn better distance metrics or estimate the underlying data distribution, the proposed approach directly models the decision boundary and thus requires fewer model assumptions. Moreover, it handles both labeled data and pairwise constraints in a unified framework. We investigate two families of pairwise loss functions, convex and nonconvex, and derive three pairwise learning algorithms by plugging in the hinge loss and the logistic loss functions. The proposed algorithms were evaluated on a people identification task using two surveillance video data sets. The experiments demonstrated that the proposed pairwise learning algorithms considerably outperform both the baseline classifiers that use only labeled data and two other pairwise learning algorithms given the same amount of pairwise constraints.
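
The idea of handling labeled data and pairwise constraints in one margin-based objective can be illustrated with a small sketch. The abstract does not give the loss functions, so everything below is an assumption made for illustration and not the authors' formulation: a linear decision function f(x) = w.x, the logistic loss on labeled examples, one possible convex pairwise penalty that is smallest when f(a) = c * f(b), and the hypothetical names `pairwise_objective`, `mu`, and `lam`.

```python
import numpy as np

def logistic_loss(z):
    """Logistic loss log(1 + exp(-z)), evaluated stably."""
    return np.logaddexp(0.0, -z)

def pairwise_objective(w, X_lab, y, X_a, X_b, c, mu=1.0, lam=0.1):
    """Labeled loss + mu * pairwise loss + lam * ||w||^2 for linear f(x) = w.x.

    y holds labels in {-1, +1}; c holds pair constraints, +1 for
    "same class" and -1 for "different class" (hypothetical encoding).
    """
    labeled = logistic_loss(y * (X_lab @ w)).mean()
    # Assumed convex pairwise penalty: smallest when f(a) == c * f(b).
    d = X_a @ w - c * (X_b @ w)
    pairwise = (logistic_loss(d) + logistic_loss(-d)).mean()
    return labeled + mu * pairwise + lam * (w @ w)

# Toy data, made up for illustration only.
X_lab = np.array([[1.0, 0.0], [-1.0, 0.0]])   # two labeled examples
y = np.array([1.0, -1.0])
X_a = np.array([[2.0, 1.0], [1.5, -0.5]])     # first element of each pair
X_b = np.array([[1.8, 0.9], [-1.2, 0.3]])     # second element of each pair
c = np.array([1.0, -1.0])                     # same class, different class

w_zero = np.zeros(2)
w_good = np.array([1.0, 0.0])
print(pairwise_objective(w_zero, X_lab, y, X_a, X_b, c))
print(pairwise_objective(w_good, X_lab, y, X_a, X_b, c))
```

Because the argument of each loss term is linear in w, this particular objective is convex in w, so any standard convex optimizer could minimize it. At w = 0 every logistic term equals log 2, which gives a simple sanity check on the implementation.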

Index Terms:
Video object classification, pairwise constraints, discriminative learning, margin-based learning.
Citation:
Rong Yan, Jian Zhang, Jie Yang, Alexander G. Hauptmann, "A Discriminative Learning Framework with Pairwise Constraints for Video Object Classification," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 4, pp. 578-593, April 2006, doi:10.1109/TPAMI.2006.65