Issue No. 06 - November/December (2011 vol. 8)
ISSN: 1545-5963
pp: 1633-1641
Hong-Dong Li, Res. Center of Modernization of Traditional Chinese Medicines, Central South Univ., Changsha, China
Yi-Zeng Liang, Res. Center of Modernization of Traditional Chinese Medicines, Central South Univ., Changsha, China
Qing-Song Xu, Sch. of Math. Sci., Central South Univ., Changsha, China
Dong-Sheng Cao, Res. Center of Modernization of Traditional Chinese Medicines, Central South Univ., Changsha, China
Bin-Bin Tan, Res. Center of Modernization of Traditional Chinese Medicines, Central South Univ., Changsha, China
Bai-Chuan Deng, Res. Center of Modernization of Traditional Chinese Medicines, Central South Univ., Changsha, China
Chen-Chen Lin, Res. Center of Modernization of Traditional Chinese Medicines, Central South Univ., Changsha, China
ABSTRACT
Selecting a small number of informative genes for microarray-based tumor classification is central to cancer prediction and treatment. Building on model population analysis, we present a new approach, called Margin Influence Analysis (MIA), designed to work with support vector machines (SVM) for selecting informative genes. The rationale for margin influence analysis is that the margin of a support vector machine is an important factor underlying the generalization performance of SVM models. Briefly, MIA reveals genes that have a statistically significant influence on the margin, using the Mann-Whitney U test. The Mann-Whitney U test is used rather than the two-sample t test because it is a nonparametric method that does not assume a particular distribution and is also robust. Using two publicly available cancer microarray data sets, it is demonstrated that MIA typically selects a small number of margin-influencing genes and achieves classification accuracy comparable to that reported in the literature. These distinguishing features and this performance may make MIA a good alternative for gene selection in high-dimensional microarray data. (The source code in MATLAB, under the GNU General Public License Version 2.0, is freely available at http://code.google.com/p/mia2009/).
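The abstract does not spell out the procedure step by step, so the following Python sketch is only an illustration of the general idea under stated assumptions, not the authors' MATLAB implementation. It assumes that a population of sub-models is built on randomly sampled gene subsets, that each sub-model yields a margin, and that for each gene the margins of sub-models including the gene are compared with those excluding it by a Mann-Whitney U test. The names `margin_influence`, `margin_fn`, and `toy_margin` are hypothetical. In real use `margin_fn` would train a linear SVM on the chosen genes and return its margin; here it is replaced by a synthetic stand-in so the sketch runs with the standard library alone.

```python
import math
import random


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test using the normal approximation.

    Returns (U statistic for x, approximate p-value). Tied values get
    midranks; no tie correction of the variance is applied in this sketch.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of 1-based ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    p = math.erfc(abs(u - mean) / sd / math.sqrt(2.0))
    return u, p


def margin_influence(n_genes, margin_fn, n_models=500, subset_size=5, seed=0):
    """For each gene, test whether including it shifts the margin.

    margin_fn(subset) is a hypothetical callback returning the margin of a
    model built on the given gene subset.
    Returns {gene: (p_value, increases_margin)}.
    """
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        subset = frozenset(rng.sample(range(n_genes), subset_size))
        models.append((subset, margin_fn(subset)))

    results = {}
    for g in range(n_genes):
        with_g = [m for s, m in models if g in s]
        without_g = [m for s, m in models if g not in s]
        if not with_g or not without_g:
            continue  # gene never (or always) sampled; nothing to compare
        u, p = mann_whitney_u(with_g, without_g)
        results[g] = (p, u > len(with_g) * len(without_g) / 2.0)
    return results


def make_toy_margin(seed=1):
    """Synthetic stand-in for SVM training: gene 0 widens the margin."""
    noise = random.Random(seed)

    def toy_margin(subset):
        return 1.0 + (0.5 if 0 in subset else 0.0) + noise.gauss(0.0, 0.1)

    return toy_margin


results = margin_influence(n_genes=20, margin_fn=make_toy_margin())
p_gene0, increases = results[0]
print("gene 0: p = %.3g, increases margin = %s" % (p_gene0, increases))
```

Genes would then be ranked or filtered by p-value, keeping those whose inclusion significantly increases the margin. The significance threshold and the sampling settings (`n_models`, `subset_size`) are illustrative choices, not values taken from the paper.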
INDEX TERMS
Support vector machines, Input variables, Computational modeling, Analytical models, Biological system modeling, Cancer, Predictive models, model population analysis, informative gene selection, cancer classification, support vector machines, margin
CITATION
Hong-Dong Li, Yi-Zeng Liang, Qing-Song Xu, Dong-Sheng Cao, Bin-Bin Tan, Bai-Chuan Deng, Chen-Chen Lin, "Recipe for uncovering predictive genes using support vector machines based on model population analysis", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 8, no. 6, pp. 1633-1641, November/December 2011, doi:10.1109/TCBB.2011.36