Issue No. 05 - May (2013 vol. 35)
Xindong Wu, Sch. of Comput. Sci. & Inf. Eng., Hefei Univ. of Technol., Hefei, China
Kui Yu, Sch. of Comput. Sci. & Inf. Eng., Hefei Univ. of Technol., Hefei, China
Wei Ding, Dept. of Comput. Sci., Univ. of Massachusetts, Boston, MA, USA
Hao Wang, Sch. of Comput. Sci. & Inf. Eng., Hefei Univ. of Technol., Hefei, China
Xingquan Zhu, Centre for Quantum Comput. & Intell. Syst., Univ. of Technol., Sydney, Sydney, NSW, Australia
We propose a new online feature selection framework for applications with streaming features, where the full feature space is unknown in advance. We define streaming features as features that flow in one by one over time while the number of training examples remains fixed. This is in contrast with traditional online learning methods, which deal only with sequentially added observations and pay little attention to streaming features. The critical challenges for Online Streaming Feature Selection (OSFS) include 1) the continuous growth of feature volumes over time, 2) a large feature space, possibly of unknown or infinite size, and 3) the unavailability of the entire feature set before learning starts. In this paper, we present a novel Online Streaming Feature Selection method to select strongly relevant and nonredundant features on the fly. An efficient Fast-OSFS algorithm is proposed to improve feature selection performance. The proposed algorithms are evaluated extensively on high-dimensional datasets and also with a real-world case study on impact crater detection. Experimental results demonstrate that the algorithms achieve better compactness and higher prediction accuracy than existing streaming feature selection algorithms.
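To make the streaming-feature setting concrete, the following is a minimal Python sketch, not the OSFS or Fast-OSFS algorithm itself. The abstract does not state which relevance and redundancy tests are used, so this sketch assumes simple Pearson-correlation thresholds as a stand-in. The function names, the thresholds `relevance_min` and `redundancy_max`, and the toy data are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no linear signal
    return sxy / math.sqrt(sxx * syy)


def select_streaming(feature_stream, target,
                     relevance_min=0.3, redundancy_max=0.9):
    """Process features one at a time; the example count stays fixed.

    Assumption: correlation thresholds replace whatever statistical
    tests the paper actually uses.
    """
    selected = {}
    for name, column in feature_stream:
        # Relevance check: discard features weakly related to the target.
        if abs(pearson(column, target)) < relevance_min:
            continue
        # Redundancy check: discard features that nearly duplicate a kept one.
        if any(abs(pearson(column, kept)) > redundancy_max
               for kept in selected.values()):
            continue
        selected[name] = column
    return list(selected)


def toy_stream():
    """Hypothetical stream: features arrive one by one, six examples each."""
    yield "f1", [2, 4, 6, 8, 10, 12]       # relevant
    yield "noise", [3, 1, 2, 2, 1, 3]      # uncorrelated with the target
    yield "f1_copy", [3, 5, 7, 9, 11, 13]  # relevant but redundant with f1
    yield "f4", [2, 1, 4, 3, 6, 5]         # relevant, not redundant


target = [1, 2, 3, 4, 5, 6]
print(select_streaming(toy_stream(), target))  # ['f1', 'f4']
```

Because the stream is consumed through a generator, the selector never needs the full feature set before it starts, which mirrors the third challenge listed in the abstract.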
Markov processes, Redundancy, Algorithm design and analysis, Prediction algorithms, Training, Accuracy, supervised learning, Feature selection, streaming features
Xingquan Zhu, Wei Ding, Kui Yu, Hao Wang and Xindong Wu, "Online Feature Selection with Streaming Features," in IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 35, no. 5, pp. 1178-1192, 2013.