Issue No. 1, Jan. 2014 (vol. 36)
pp. 127-139
Anne Hendrikse, Signals & Systems Group, University of Twente, Overijssel, Netherlands
Raymond Veldhuis, Signals & Systems Group, University of Twente, Overijssel, Netherlands
Luuk Spreeuwers, Signals & Systems Group, University of Twente, Overijssel, Netherlands
ABSTRACT
Increasing the dimensionality of a data set often causes estimation problems, collectively known as the curse of dimensionality. One problem in second-order statistics (SOS) estimation from high-dimensional data is that the resulting covariance matrices are not full rank, so their inversion, needed for example in verification systems based on the likelihood ratio, is an ill-posed problem known as the singularity problem. A classical remedy is to project the data onto a lower dimensional subspace using principal component analysis (PCA), under the assumption that any further estimation on the dimension-reduced data is free from the effects of high dimensionality. Using theory on SOS estimation in high-dimensional spaces, we show that the PCA solution is far from optimal in verification systems when high dimensionality is the sole source of error. Already at moderate dimensionality it is outperformed by solutions based on Euclidean distances, and it breaks down completely when the dimensionality becomes very high. We propose a new method, the fixed-point eigenwise correction, which does not have these disadvantages and performs close to optimal.
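The difficulty the abstract describes can be reproduced in a few lines. Below is a minimal Python sketch (illustrative only, not the authors' fixed-point eigenwise correction; the within-class variances are a hypothetical choice made purely for illustration) showing that with more dimensions than samples the sample covariance matrix is singular, that its nonzero eigenvalues spread out as the Marčenko-Pastur law predicts, and that a PCA projection restores an invertible model only by discarding dimensions:

import numpy as np

rng = np.random.default_rng(0)
p, n = 200, 50  # dimensionality exceeds the number of samples

# Zero-mean Gaussian data whose population covariance is the identity.
X = rng.standard_normal((n, p))
S = (X.T @ X) / n  # sample covariance; its rank is at most n < p

print("rank(S):", np.linalg.matrix_rank(S), "out of", p)  # singular

# Eigenvalue bias: every population eigenvalue is 1, yet the nonzero sample
# eigenvalues spread over roughly [(1 - sqrt(p/n))^2, (1 + sqrt(p/n))^2],
# as the Marcenko-Pastur law predicts.
evals = np.linalg.eigvalsh(S)[::-1]  # descending order
print("largest / smallest nonzero eigenvalue:", evals[0], evals[n - 1])

# Classical remedy: project onto the k leading sample eigenvectors so the
# reduced covariance becomes invertible. This discards p - k dimensions
# rather than correcting the eigenvalue bias.
k = 20
w, V = np.linalg.eigh(S)  # eigenvalues in ascending order
Vk, wk = V[:, -k:], w[-k:]  # top-k eigenvectors and eigenvalues

def log_gaussian(y, var):
    # Log density of independent zero-mean Gaussians with variances var.
    return -0.5 * np.sum(y**2 / var + np.log(2.0 * np.pi * var))

# Toy likelihood-ratio verification score in the PCA subspace:
# log p(y | same subject) - log p(y | background). The within-class
# variances below are hypothetical, chosen only to make the score computable.
within_var = 0.5 * np.ones(k)
x = rng.standard_normal(p)  # probe sample
y = Vk.T @ x  # projection onto the PCA subspace
score = log_gaussian(y, within_var) - log_gaussian(y, wk)
print("toy log-likelihood-ratio score:", float(score))

Running this prints a rank of 50 for a 200 x 200 covariance matrix, and nonzero sample eigenvalues covering roughly [1, 9] instead of concentrating at the true value 1; that spread is the eigenvalue bias which the paper's eigenwise correction is designed to undo instead of sidestepping by dimension reduction.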
INDEX TERMS
fixed-point eigenvalue correction, high-dimensional verification, eigenvalue bias correction, variance correction, Euclidean distance, principal component analysis, Marčenko-Pastur equation, eigenwise correction
CITATION
Anne Hendrikse, Raymond Veldhuis, Luuk Spreeuwers, "Likelihood-Ratio-Based Verification in High-Dimensional Spaces," IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 36, no. 1, pp. 127-139, Jan. 2014, doi:10.1109/TPAMI.2013.93