Extracting features from high-dimensional data is a critically important task for pattern recognition and machine learning applications. High-dimensional data typically have many more variables than observations, and contain significant noise, missing components, or outliers. Features extracted from high-dimensional data need to be discriminative and sparse, and must capture the essential characteristics of the data. In this paper, we present a way to construct multivariate features and then classify the data into the proper classes. The resulting small subset of features is nearly the best in the sense of Greenshtein's persistence; however, the estimated feature weights may be biased. We take a systematic approach to correcting these biases. We use conjugate gradient-based primal-dual interior-point techniques for large-scale problems. We apply our procedure to microarray gene analysis. Experimental results confirm the effectiveness of our method.
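The two-stage idea in the abstract (select a sparse subset of features, then correct the bias in their weights) can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes an L1-penalized least-squares objective, swaps the conjugate gradient-based primal-dual interior-point solver for plain proximal gradient descent (ISTA), and uses an ordinary least-squares refit on the selected features as a generic stand-in for the bias correction. The function names, the penalty `lam`, and the synthetic data are all hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=500):
    """Minimize (1/2n)||y - Xw||^2 + lam * ||w||_1 by proximal gradient (ISTA).

    Assumption: ISTA replaces the interior-point solver named in the abstract.
    """
    n, p = X.shape
    # Step size = 1 / Lipschitz constant of the smooth part's gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = soft_threshold(w - step * grad, step * lam)
    return w

def refit_on_support(X, y, w):
    """Reduce shrinkage bias by refitting least squares on the selected features.

    Assumption: a generic debiasing step, not the paper's correction procedure.
    """
    support = np.flatnonzero(w)
    w_refit = np.zeros_like(w)
    if support.size:
        coef = np.linalg.lstsq(X[:, support], y, rcond=None)[0]
        w_refit[support] = coef
    return w_refit

if __name__ == "__main__":
    # Synthetic data with far more variables (200) than observations (40).
    rng = np.random.default_rng(0)
    X = rng.standard_normal((40, 200))
    w_true = np.zeros(200)
    w_true[:3] = [3.0, -2.0, 1.5]
    y = X @ w_true + 0.1 * rng.standard_normal(40)

    w_sparse = lasso_ista(X, y, lam=0.5)
    w_debiased = refit_on_support(X, y, w_sparse)
    print("selected features:", np.flatnonzero(w_sparse))
    print("shrunken weights: ", w_sparse[:3])
    print("refit weights:    ", w_debiased[:3])
```

The L1 penalty shrinks every retained weight toward zero, which is one source of the bias the abstract mentions; refitting without the penalty on the selected columns removes that shrinkage while keeping the same support.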
learning (artificial intelligence), biology computing, feature extraction, genetics, large-scale problems, sparse learning machine, high-dimensional data, microarray gene analysis, pattern recognition, machine learning, multivariate features, Greenshtein persistence, conjugate gradient-based primal-dual interior-point techniques, support vector machines, support vector machine classification, data mining, bioinformatics, pattern analysis, large-scale systems, cancer, feature selection, persistence, bias, convex optimization, primal-dual interior-point optimization, cancer classification
"A Sparse Learning Machine for High-Dimensional Data with Application to Microarray Gene Analysis", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 7, no. , pp. 636-646, October-December 2010, doi:10.1109/TCBB.2009.8