Issue No. 04 - July/August (2011 vol. 8)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2010.85
Xin Zhao, Sanjole Inc., Honolulu
Leo Wang-Kit Cheung, Loyola University Medical Center, Maywood
Identifying the genes that are significantly differentially expressed in a disease can help in understanding the disease at the genomic level. A hierarchical statistical model named the multiclass kernel-imbedded Gaussian process (mKIGP) is developed under a Bayesian framework for multiclass classification problems using microarray gene expression data. Specifically, based on a multinomial probit regression setting, an empirically adaptive algorithm with a cascading structure is designed to find appropriate featuring kernels, to discover potentially significant genes, and to make optimal tumor/cancer class predictions. A Gibbs sampler is adopted as the core of the algorithm to perform Bayesian inference. A prescreening procedure is implemented to reduce the computational complexity. In the simulated examples, mKIGP performed very close to the Bayesian bound and outperformed the referenced state-of-the-art methods in a linear case, a nonlinear case, and a case with a mislabeled training sample. It shows great promise for problems on which linear-model-based methods become unsatisfactory. The mKIGP was also applied to four published real microarray data sets, and it was very effective at identifying significant differentially expressed genes and predicting classes in all of them.
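To make the core idea concrete, the following is a minimal sketch of a Gibbs sampler for a multinomial probit model with Gaussian process latent functions. It is not the mKIGP algorithm itself: the abstract does not give the model equations, so the formulation here is a common textbook one assumed for illustration (independent zero-mean GP priors per class, a fixed Gaussian kernel, unit-variance probit noise). The cascading kernel search, gene selection, and prescreening steps are omitted, and the names `rbf_kernel`, `gibbs_probit_gp`, `predict`, and the `width` parameter are hypothetical.

```python
import numpy as np
from scipy.stats import truncnorm


def rbf_kernel(A, B, width):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * width ** 2))


def gibbs_probit_gp(X, y, n_classes, width=1.0, n_iter=200, burn_in=100, seed=0):
    """Gibbs sampler for a multinomial probit model with GP latent functions.

    Assumed model (illustrative, not taken from the paper):
      f_j ~ GP(0, K),  z_ij ~ N(f_j(x_i), 1),  y_i = argmax_j z_ij.
    Returns the post-burn-in average of the latent utilities z, shape (n, n_classes).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    rows = np.arange(n)
    K = rbf_kernel(X, X, width)
    # f_j | z_j is Gaussian with mean M z_j and covariance M, where M = K (K + I)^-1.
    M = np.linalg.solve(K + np.eye(n), K)
    L = np.linalg.cholesky((M + M.T) / 2.0 + 1e-8 * np.eye(n))

    f = np.zeros((n, n_classes))
    z = np.zeros((n, n_classes))
    z[rows, y] = 1.0  # start from a state that satisfies the argmax constraint
    z_sum = np.zeros_like(z)

    for it in range(n_iter):
        # Step 1: draw latent utilities z | f, y from truncated normals.
        for j in range(n_classes):
            own = (y == j)
            # Samples labeled j: z_ij must exceed every other utility of sample i.
            # Other samples: z_ij must stay below the utility of the true class.
            lower = np.where(own, np.delete(z, j, axis=1).max(axis=1), -np.inf)
            upper = np.where(own, np.inf, z[rows, y])
            z[:, j] = truncnorm.rvs(lower - f[:, j], upper - f[:, j],
                                    loc=f[:, j], scale=1.0, size=n, random_state=rng)
        # Step 2: draw latent functions f | z from their Gaussian conditionals.
        for j in range(n_classes):
            f[:, j] = M @ z[:, j] + L @ rng.standard_normal(n)
        if it >= burn_in:
            z_sum += z
    return z_sum / (n_iter - burn_in)


def predict(X_train, z_mean, X_new, width=1.0):
    """Plug-in class prediction from the averaged latent utilities."""
    K = rbf_kernel(X_train, X_train, width)
    K_new = rbf_kernel(X_new, X_train, width)
    f_new = K_new @ np.linalg.solve(K + np.eye(len(X_train)), z_mean)
    return f_new.argmax(axis=1)


if __name__ == "__main__":
    # Toy data: three well-separated 2-D clusters standing in for expression profiles.
    rng = np.random.default_rng(1)
    centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
    y = np.repeat(np.arange(3), 15)
    X = centers[y] + 0.5 * rng.standard_normal((45, 2))
    z_mean = gibbs_probit_gp(X, y, n_classes=3)
    print(predict(X, z_mean, centers))
```

The prediction step uses the averaged latent utilities as a plug-in estimate rather than integrating over the full posterior, which keeps the sketch short. A fuller treatment in the spirit of the abstract would also sample kernel parameters and a gene-inclusion indicator inside the same loop.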
Gene expression, Gaussian processes, Monte Carlo methods, nonlinear multiclass systems.
Xin Zhao, Leo Wang-Kit Cheung, "Multiclass Kernel-Imbedded Gaussian Processes for Microarray Data Analysis", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 8, no. 4, pp. 1041-1053, July/August 2011, doi:10.1109/TCBB.2010.85