IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 31, no. 12, December 2009
Dimitrios Ververidis , Aristotle University of Thessaloniki, Thessaloniki
Constantine Kotropoulos , Aristotle University of Thessaloniki, Thessaloniki
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPAMI.2009.84
When an infinite training set is used, the Mahalanobis distance between a pattern measurement vector of dimensionality D and the center of the class it belongs to is distributed as \chi^2 with D degrees of freedom. However, for finite training sets, the distribution of the Mahalanobis distance becomes either a Fisher or a Beta distribution, depending on whether cross-validation or resubstitution is used for parameter estimation. The total variation between the \chi^2 and the Fisher distributions, as well as between the \chi^2 and the Beta distributions, allows us to measure the information loss in high dimensions. The information loss is then exploited to set a lower limit on the correct classification rate achieved by the Bayes classifier, which is used in subset feature selection.
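The infinite-training-set baseline stated above can be checked numerically: when the true class mean and covariance are known, the squared Mahalanobis distance of a D-dimensional Gaussian sample follows \chi^2 with D degrees of freedom. The sketch below (an illustration, not code from the paper; the dimensions, sample size, and covariance are arbitrary choices) draws Gaussian samples and compares the empirical distances against the \chi^2 law with a Kolmogorov-Smirnov test:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Illustrative sketch: with the TRUE mean and covariance (the "infinite
# training set" limit), the squared Mahalanobis distance of a D-dimensional
# Gaussian sample is chi^2-distributed with D degrees of freedom.
D, N = 5, 20000                         # arbitrary dimensionality and sample size
mean = np.zeros(D)
A = rng.standard_normal((D, D))
cov = A @ A.T + D * np.eye(D)           # a well-conditioned covariance matrix

X = rng.multivariate_normal(mean, cov, size=N)

cov_inv = np.linalg.inv(cov)
diff = X - mean
# Squared Mahalanobis distance of each sample to the class center
d2 = np.einsum('ij,jk,ik->i', diff, cov_inv, diff)

# Kolmogorov-Smirnov test of the empirical distances against chi^2(D)
ks = stats.kstest(d2, cdf=stats.chi2(df=D).cdf)
print(f"mean distance: {d2.mean():.3f} (chi^2 mean is D = {D})")
print(f"KS statistic:  {ks.statistic:.4f}")
```

Replacing the true parameters with estimates computed from a small training set is precisely where the paper's Fisher (cross-validation) and Beta (resubstitution) distributions take over from \chi^2, and the fit to \chi^2 degrades as D grows relative to the training-set size.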
Keywords: Bayes classifier, Gaussian distribution, Mahalanobis distance, feature selection, cross validation.
Dimitrios Ververidis, Constantine Kotropoulos, "Information Loss of the Mahalanobis Distance in High Dimensions: Application to Feature Selection", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 31, no. 12, pp. 2275-2281, December 2009, doi:10.1109/TPAMI.2009.84