Issue No. 05 - May (2008 vol. 20)
High-dimensional data and small sample size problems occur in many modern pattern classification applications, such as face recognition and gene expression data analysis. An important step in dealing with such data is dimensionality reduction. Principal component analysis (PCA) and between-group analysis (BGA) are two commonly used methods, and various extensions of both exist. Both approaches are motivated by their best-approximation properties. From a pattern recognition perspective, we show that PCA based on the total-scatter matrix preserves linear separability, whereas BGA based on the between-scatter matrix retains only the distances between class centroids. Moreover, we propose a novel uncorrelated discriminant analysis (UDA) algorithm. It combines rank-preserving dimensionality reduction with constrained discriminant analysis and serves as a simple and complete solution to the small sample size problem. We conduct a series of comparative studies on face images and gene expression data sets to evaluate UDA in terms of classification accuracy and robustness.
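The contrast between the two projections can be illustrated with a small numerical sketch. This is not the UDA algorithm from the paper. It is only a minimal illustration, on synthetic data, of projecting onto the range of the total-scatter matrix (PCA) and of the between-scatter matrix (BGA). The helper names `scatter_matrices`, `range_basis`, and `pairwise_dist`, the data sizes, and the tolerance are assumptions made for this example.

```python
import numpy as np

def scatter_matrices(X, y):
    """Return the total-scatter and between-class scatter matrices."""
    mean = X.mean(axis=0)
    Xc = X - mean
    St = Xc.T @ Xc
    Sb = np.zeros_like(St)
    for c in np.unique(y):
        Xk = X[y == c]
        d = (Xk.mean(axis=0) - mean)[:, None]
        Sb += len(Xk) * (d @ d.T)
    return St, Sb

def range_basis(S, tol=1e-10):
    """Orthonormal basis of the range of a symmetric PSD matrix."""
    w, V = np.linalg.eigh(S)
    return V[:, w > tol * w.max()]

def pairwise_dist(A):
    """Euclidean distances between all pairs of rows of A."""
    return np.linalg.norm(A[:, None, :] - A[None, :, :], axis=-1)

# Synthetic small-sample-size setting (assumed): 15 samples, 50 features, 3 classes.
rng = np.random.default_rng(0)
n_per_class, dim, n_classes = 5, 50, 3
centers = rng.normal(scale=3.0, size=(n_classes, dim))
X = np.vstack([c + rng.normal(size=(n_per_class, dim)) for c in centers])
y = np.repeat(np.arange(n_classes), n_per_class)

St, Sb = scatter_matrices(X, y)
W_pca = range_basis(St)   # rank at most n - 1
W_bga = range_basis(Sb)   # rank at most (number of classes) - 1

Z_pca = X @ W_pca
Z_bga = X @ W_bga
centroids = np.array([X[y == c].mean(axis=0) for c in range(n_classes)])

# PCA onto range(St) keeps every pairwise sample distance, so a linearly
# separable training set stays linearly separable after projection.
print(np.allclose(pairwise_dist(X), pairwise_dist(Z_pca)))
# BGA onto range(Sb) keeps the distances between class centroids ...
print(np.allclose(pairwise_dist(centroids), pairwise_dist(centroids @ W_bga)))
# ... but in general not the distances between individual samples.
print(np.allclose(pairwise_dist(X), pairwise_dist(Z_bga)))
```

In this sketch the PCA projection keeps all nonzero-variance directions, which is what makes it rank preserving. Truncating to fewer components would no longer guarantee that sample distances survive.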
Feature extraction or construction, Pattern Recognition
W. Yang, H. Yan and D. Dai, "Feature Extraction and Uncorrelated Discriminant Analysis for High-Dimensional Data," in IEEE Transactions on Knowledge & Data Engineering, vol. 20, no. , pp. 601-614, 2007.