Issue No.01 - Jan.-Feb. (2013 vol.10)
pp: 207-212
Xiang Wan, Dept. of Comput. Sci., Hong Kong Baptist Univ., Hong Kong, China
Can Yang, Div. of Biostat., Yale Univ., New Haven, CT, USA
Qiang Yang, Dept. of Comput. Sci. & Eng., Hong Kong Univ. of Sci. & Technol., Hong Kong, China
Hongyu Zhao, Div. of Biostat., Yale Univ., New Haven, CT, USA
Weichuan Yu, Dept. of Electron. & Comput. Eng., Hong Kong Univ. of Sci. & Technol., Hong Kong, China
ABSTRACT
Genome-wide association studies (GWAS) have been successful in identifying genetic variants associated with complex human diseases. In GWAS, multilocus association analyses that exploit linkage disequilibrium (LD), known as haplotype-based analyses, may have greater power than single-locus analyses for detecting disease susceptibility loci. However, the large number of SNPs genotyped in GWAS poses great computational challenges for detecting haplotype associations. We present a fast method named HapBoost for finding haplotype associations, which can be applied to quickly screen the whole genome. The effectiveness of HapBoost is demonstrated on both synthetic and real data sets. The experimental results show that the proposed approach achieves comparable accuracy while running much faster than existing methods.
INDEX TERMS
Bioinformatics, Genomics, Diseases, Estimation, Testing, Computational biology, Educational institutions, linkage disequilibrium, SNP, haplotype, genome-wide association studies
CITATION
Xiang Wan, Can Yang, Qiang Yang, Hongyu Zhao, Weichuan Yu, "HapBoost: A Fast Approach to Boosting Haplotype Association Analyses in Genome-Wide Association Studies", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 10, no. 1, pp. 207-212, Jan.-Feb. 2013, doi:10.1109/TCBB.2013.6