Pattern Recognition, International Conference on (2010)
Aug. 23, 2010 to Aug. 26, 2010
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICPR.2010.378
The performance of a pattern recognition system depends strongly on the employed feature-selection method. We perform an in-depth analysis of two main measures used in the filter model: the correlation-feature-selection (CFS) measure and the minimal-redundancy-maximal-relevance (mRMR) measure. We show that these measures can be fused and generalized into a generic feature-selection (GeFS) measure. Furthermore, we propose a new feature-selection method that ensures globally optimal feature sets. The new approach is based on solving a mixed 0-1 linear programming (M01LP) problem using the branch-and-bound algorithm. In this M01LP problem, the number of constraints and variables is linear ($O(n)$) in the number $n$ of features in the full set. To evaluate the quality of our GeFS measure, we chose the design of an intrusion detection system (IDS) as a possible application. Experimental results obtained over the KDD Cup'99 test data set for IDS show that the GeFS measure removes 93% of irrelevant and redundant features from the original data set, while keeping or even improving the classification accuracy.
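To make the two filter measures named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It uses the standard textbook forms of the CFS merit and the mRMR score. The toy `relevance` and `redundancy` values and all function names are assumptions made for illustration. The paper's M01LP formulation with branch-and-bound is not reproduced here; the sketch swaps in a plain exhaustive search, which also returns a globally optimal subset but takes exponential time in $n$.

```python
from itertools import combinations
from math import sqrt


def cfs_merit(subset, relevance, redundancy):
    """CFS merit: k * mean(r_cf) / sqrt(k + k*(k-1) * mean(r_ff))."""
    k = len(subset)
    mean_rcf = sum(relevance[i] for i in subset) / k
    pairs = list(combinations(subset, 2))
    # Mean feature-feature correlation over unordered pairs (0 for a single feature).
    mean_rff = sum(redundancy[i][j] for i, j in pairs) / len(pairs) if pairs else 0.0
    return k * mean_rcf / sqrt(k + k * (k - 1) * mean_rff)


def mrmr_score(subset, relevance, redundancy):
    """mRMR score: mean relevance minus mean redundancy over all ordered pairs."""
    k = len(subset)
    rel = sum(relevance[i] for i in subset) / k
    # The double sum includes i == j, as in the usual mRMR definition.
    red = sum(redundancy[i][j] for i in subset for j in subset) / (k * k)
    return rel - red


def best_subset(measure, relevance, redundancy):
    """Exhaustive search over all non-empty subsets (stand-in for branch-and-bound)."""
    n = len(relevance)
    candidates = (s for k in range(1, n + 1) for s in combinations(range(n), k))
    return max(candidates, key=lambda s: measure(s, relevance, redundancy))


# Hypothetical toy data: feature-class relevance and feature-feature redundancy.
relevance = [0.8, 0.7, 0.1, 0.6]
redundancy = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.1, 0.2],
    [0.1, 0.1, 1.0, 0.1],
    [0.2, 0.2, 0.1, 1.0],
]

best = best_subset(cfs_merit, relevance, redundancy)
print("CFS-optimal subset:", best)
print("CFS merit:", cfs_merit(best, relevance, redundancy))
print("mRMR score of that subset:", mrmr_score(best, relevance, redundancy))
```

Both scores depend only on which features are selected, so each candidate subset can be encoded as a vector of $n$ binary variables. That encoding is what makes a 0-1 programming treatment, as described in the abstract, possible in place of the brute-force enumeration used in this sketch.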
H. T. Nguyen, K. Franke and S. Petrovic, "Towards a Generic Feature-Selection Measure for Intrusion Detection," 2010 20th International Conference on Pattern Recognition (ICPR 2010), Istanbul, 2010, pp. 1529-1532.