IEEE Transactions on Pattern Analysis & Machine Intelligence, 2011, vol. 33, Issue No. 05 - May

pp: 1037-1050

Wei Bian, University of Technology, Sydney

Dacheng Tao, University of Technology, Sydney

ABSTRACT

We propose a new criterion for discriminative dimension reduction, max-min distance analysis (MMDA). Given a data set with C classes, represented by homoscedastic Gaussians, MMDA maximizes the minimum pairwise distance of these C classes in the selected low-dimensional subspace. Thus, unlike Fisher's linear discriminant analysis (FLDA) and other popular discriminative dimension reduction criteria, MMDA duly considers the separation of all class pairs. To deal with the general case of data distributions, we also extend MMDA to kernel MMDA (KMMDA). Dimension reduction via MMDA/KMMDA leads to a nonsmooth max-min optimization problem with orthonormal constraints. We develop a sequential convex relaxation algorithm to solve it approximately. To evaluate the effectiveness of the proposed criterion and the corresponding algorithm, we conduct classification and data visualization experiments on both synthetic and real data sets. Experimental results demonstrate the effectiveness of MMDA/KMMDA associated with the proposed optimization algorithm.
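The max-min criterion described in the abstract can be illustrated with a small sketch. This is not the authors' sequential convex relaxation algorithm: it assumes the data have already been whitened, so the shared class covariance is the identity, and it replaces the optimization with a brute-force search over one-dimensional projections in the plane. The function names and the toy class means are hypothetical and chosen only for illustration.

```python
import itertools

import numpy as np


def min_pairwise_sq_distance(means, W):
    """Smallest squared distance between any two class means after projecting by W."""
    projected = means @ W
    pairs = itertools.combinations(range(len(projected)), 2)
    return min(float(np.sum((projected[i] - projected[j]) ** 2)) for i, j in pairs)


def flda_direction(means):
    """Leading eigenvector of the between-class scatter.

    Assumes equal class priors and identity within-class covariance,
    in which case the 1-D FLDA projection reduces to this eigenvector.
    """
    centered = means - means.mean(axis=0)
    scatter = centered.T @ centered
    _, eigvecs = np.linalg.eigh(scatter)  # eigenvalues in ascending order
    return eigvecs[:, -1:]  # shape (d, 1), unit norm


def maxmin_direction_2d(means, n_angles=1800):
    """Brute-force stand-in for the max-min problem: scan unit vectors in the plane."""
    best_W, best_value = None, -np.inf
    for theta in np.linspace(0.0, np.pi, n_angles, endpoint=False):
        W = np.array([[np.cos(theta)], [np.sin(theta)]])
        value = min_pairwise_sq_distance(means, W)
        if value > best_value:
            best_W, best_value = W, value
    return best_W, best_value


# Hypothetical toy data: two close classes and one distant class.
means = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0]])
flda_value = min_pairwise_sq_distance(means, flda_direction(means))
_, maxmin_value = maxmin_direction_2d(means)
print(f"FLDA min pairwise distance:    {flda_value:.4f}")
print(f"max-min min pairwise distance: {maxmin_value:.4f}")
```

On this toy configuration the FLDA direction is dominated by the distant class and nearly merges the two close classes, while the max-min search picks a direction that keeps every pair apart. In higher dimensions a grid search is not practical, which is where a relaxation-based solver such as the one the abstract mentions becomes necessary.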

INDEX TERMS

Fisher's linear discriminant analysis, dimension reduction, convex relaxation, data visualization, pattern classification.

CITATION

Wei Bian, Dacheng Tao, "Max-Min Distance Analysis by Using Sequential SDP Relaxation for Dimension Reduction",

*IEEE Transactions on Pattern Analysis & Machine Intelligence*, vol. 33, no. 5, pp. 1037-1050, May 2011, doi:10.1109/TPAMI.2010.189