Issue No. 04 - October-December (2010 vol. 7)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2009.8
Qiang Cheng, Southern Illinois University, Carbondale
Extracting features from high-dimensional data is a critically important task for pattern recognition and machine learning applications. High-dimensional data typically have many more variables than observations, and contain significant noise, missing components, or outliers. Features extracted from high-dimensional data need to be discriminative and sparse, and must capture the essential characteristics of the data. In this paper, we present a way to construct multivariate features and then classify the data into proper classes. The resulting small subset of features is nearly the best in the sense of Greenshtein's persistence; however, the estimated feature weights may be biased. We take a systematic approach to correcting the biases. We use conjugate gradient-based primal-dual interior-point techniques for large-scale problems. We apply our procedure to microarray gene analysis. The effectiveness of our method is confirmed by experimental results.
learning (artificial intelligence), biology computing, feature extraction, genetics
Q. Cheng, "A Sparse Learning Machine for High-Dimensional Data with Application to Microarray Gene Analysis," in IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 7, no. 4, pp. 636-646, 2010.