Issue No.05 - Sept.-Oct. (2012 vol.9)
pp: 1529-1534
Bo Liao , Coll. of Inf. Sci. & Eng., Hunan Univ., Changsha, China
Xiong Li , Coll. of Inf. Sci. & Eng., Hunan Univ., Changsha, China
Wen Zhu , Coll. of Inf. Sci. & Eng., Hunan Univ., Changsha, China
Zhi Cao , Coll. of Inf. Sci. & Eng., Hunan Univ., Changsha, China
ABSTRACT
Association studies between complex diseases and single nucleotide polymorphisms (SNPs) or haplotypes have recently received great attention. However, these studies are limited by the cost of genotyping all SNPs. It is therefore essential to find a small subset of tag SNPs that represents the rest of the SNPs. The presence of linkage disequilibrium between tag SNPs and the disease variant (genotyped or not) may allow fine-mapping studies. In this paper, we combine a nearest-means classifier (NMC) and an ant colony algorithm to select tags. Results show that our method (ACO/NMC) achieves prediction accuracy similar to that of BPSO/SVM and is better than BPSO/STAMPA on small data sets. On large data sets, although the prediction accuracy of our method is lower than that of BPSO/SVM, ACO/NMC can reach a high accuracy (>99 percent) in a relatively short time. As the number of tags increases, the time complexity of NMC grows nearly linearly. To assess the ability of tags to locate a disease locus, we simulate a case-control study and use two-locus haplotype analysis to quantitatively assess the power. The results show that when 20 percent of all SNPs are selected as tags by NMC, they have about 10 percent higher power than random tags, on average.
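The abstract does not spell out how the classifier is applied, so the following is only a minimal sketch of the general idea of predicting untyped SNPs from tag SNPs with a nearest-means classifier. It assumes haplotypes are encoded as 0/1 vectors, that distance is squared Euclidean distance, and that a tag set is scored by leave-one-out prediction accuracy. The function names `nearest_means_predict` and `loo_accuracy` and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Haplotype = Sequence[int]  # assumed encoding: one 0/1 allele per SNP


def nearest_means_predict(train: List[Haplotype], tags: Sequence[int],
                          target: int, query: Haplotype) -> int:
    """Predict the allele at SNP `target` for `query` from its tag alleles."""
    # Accumulate, per target allele, the sum of the tag-allele vectors.
    sums = {0: [0.0] * len(tags), 1: [0.0] * len(tags)}
    counts = {0: 0, 1: 0}
    for hap in train:
        label = hap[target]
        counts[label] += 1
        for k, t in enumerate(tags):
            sums[label][k] += hap[t]
    # Choose the allele whose class mean is closest to the query's tag alleles.
    best_label, best_dist = 0, float("inf")
    for label in (0, 1):
        if counts[label] == 0:
            continue  # allele never observed at this SNP in the training data
        mean = [s / counts[label] for s in sums[label]]
        dist = sum((query[t] - m) ** 2 for t, m in zip(tags, mean))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(haps: List[Haplotype], tags: Sequence[int]) -> float:
    """Leave-one-out accuracy of predicting every non-tag SNP from `tags`."""
    tag_set = set(tags)
    non_tags = [j for j in range(len(haps[0])) if j not in tag_set]
    correct = total = 0
    for i, hap in enumerate(haps):
        train = haps[:i] + haps[i + 1:]  # hold out haplotype i
        for j in non_tags:
            correct += nearest_means_predict(train, tags, j, hap) == hap[j]
            total += 1
    return correct / total if total else 1.0


# Toy data (hypothetical): SNP 1 copies SNP 0, and SNP 3 copies SNP 2.
haps = [
    [0, 0, 0, 0], [0, 0, 0, 0],
    [1, 1, 0, 0], [1, 1, 0, 0],
    [0, 0, 1, 1], [0, 0, 1, 1],
    [1, 1, 1, 1], [1, 1, 1, 1],
]
informative = loo_accuracy(haps, tags=[0, 2])  # one tag per correlated pair
redundant = loo_accuracy(haps, tags=[0, 1])    # both tags from the same pair
print(informative, redundant)
```

A search procedure such as the ant colony algorithm named in the abstract could, in principle, use a score like `loo_accuracy` to compare candidate tag sets; how the authors actually couple the two is not described here.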
INDEX TERMS
support vector machines, biology computing, diseases, genetics, molecular biophysics, pattern classification, polymorphism, two-locus haplotype analysis, genetic association, complex diseases, single nucleotide polymorphisms, linkage disequilibrium, genotyped disease variant, nearest-means classifier, ant colony algorithm, BPSO-SVM, BPSO-STAMPA, case-control study, Diseases, Accuracy, Support vector machines, Bioinformatics, Genomics, Prediction algorithms, tag selection, Haplotypes, single nucleotide polymorphism, informative SNP
CITATION
Bo Liao, Xiong Li, Wen Zhu, Zhi Cao, "A Novel Method to Select Informative SNPs and Their Application in Genetic Association Studies", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 5, pp. 1529-1534, Sept.-Oct. 2012, doi:10.1109/TCBB.2012.70