Fifth IEEE Symposium on Bioinformatics and Bioengineering (BIBE'05)
GroupAdaBoost for Selecting Important Genes
Minneapolis, Minnesota
October 19-October 21
ISBN: 0-7695-2476-1
This paper proposes GroupAdaBoost, a variant of AdaBoost for statistical pattern recognition. The proposed algorithm targets the "p ≫ n" problem that arises in bioinformatics. Typically, p is the number of genes investigated and n is the number of individuals in a microarray experiment that measures gene expression in order to extract expression patterns related to disease status. Ordinary methods for predicting the genetic causes of disease tend to over-learn (overfit) from a particular training dataset because of the "p ≫ n" problem. We observed that GroupAdaBoost performed robustly when the number of genes was excessive. We compared the proposed method with other methods on several publicly available real datasets, and conducted a small-scale simulation study to confirm the validity of the proposed method.
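To make the "p ≫ n" setting concrete, the following is a minimal sketch of ordinary discrete AdaBoost with decision stumps on synthetic data. It is not GroupAdaBoost, whose details the abstract does not give. The data generator, the sizes (20 training individuals, 200 genes, one informative gene), and all function names are assumptions chosen for illustration only.

```python
import math
import random

def make_data(n, p, rng, n_informative=1, shift=1.5):
    """Synthetic 'expression' matrix (hypothetical): only the first
    n_informative genes depend on the class label."""
    X, y = [], []
    for _ in range(n):
        label = rng.choice([-1, 1])
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        for j in range(n_informative):
            row[j] += shift * label
        X.append(row)
        y.append(label)
    return X, y

def stump_predict(row, j, thr, pol):
    """Decision stump on gene j: predict pol if x_j > thr, else -pol."""
    return pol if row[j] > thr else -pol

def best_stump(X, y, w):
    """Exhaustive search for the stump with the lowest weighted error.
    Assumes the weights w sum to 1."""
    n, p = len(X), len(X[0])
    best = (float("inf"), 0, 0.0, 1)
    for j in range(p):
        for thr in sorted({row[j] for row in X}):
            err = sum(w[i] for i in range(n)
                      if stump_predict(X[i], j, thr, 1) != y[i])
            # Flipping the polarity turns error err into 1 - err.
            for pol, e in ((1, err), (-1, 1.0 - err)):
                if e < best[0]:
                    best = (e, j, thr, pol)
    return best

def adaboost(X, y, rounds):
    """Plain discrete AdaBoost; returns a list of (alpha, gene, thr, pol)."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, j, thr, pol = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)
        model.append((alpha, j, thr, pol))
        # Up-weight misclassified individuals, then renormalise.
        w = [w[i] * math.exp(-alpha * y[i] * stump_predict(X[i], j, thr, pol))
             for i in range(n)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, row):
    score = sum(a * stump_predict(row, j, thr, pol) for a, j, thr, pol in model)
    return 1 if score >= 0 else -1

def error_rate(model, X, y):
    return sum(predict(model, r) != t for r, t in zip(X, y)) / len(X)

rng = random.Random(0)
X_train, y_train = make_data(20, 200, rng)    # n = 20, p = 200
X_test, y_test = make_data(200, 200, rng)
model = adaboost(X_train, y_train, rounds=10)
print("train error:", error_rate(model, X_train, y_train))
print("test error:", error_rate(model, X_test, y_test))
print("genes used:", sorted({j for _, j, _, _ in model}))
```

With so few individuals and so many genes, some noise genes can separate the training labels well by chance, so the list of selected genes may include indices other than the informative gene 0, and the test error may exceed the training error. That gap is the over-learning the abstract refers to.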
Citation:
Takashi Takenouchi, Masaru Ushijima, Shinto Eguchi, "GroupAdaBoost for Selecting Important Genes," Fifth IEEE Symposium on Bioinformatics and Bioengineering (BIBE'05), pp. 218-221, 2005