Issue No.03 - July-September (2008 vol.5)
pp: 448-460
ABSTRACT
The aim of genetic mapping is to locate the loci responsible for specific traits such as complex diseases. These traits are normally caused by mutations at multiple loci whose locations and interactions are unknown. In this work, we model the biological system that relates DNA polymorphisms to complex traits as a linear mixing process. Given this model, we propose a new fine-scale genetic mapping method based on independent component analysis. The proposed method outputs both independent groups of associated SNPs and the specific SNPs associated with the phenotype. It is applied to a clinical data set for schizophrenia with 368 individuals and 42 SNPs. It is also applied to a simulation study to investigate its performance in more depth. The results demonstrate the novel characteristics of the proposed method compared to other genetic mapping methods. Finally, we study the robustness of the proposed method to missing genotype values and limited sample sizes.
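As a rough illustration of what a linear mixing process means in the context of independent component analysis, the generic ICA model can be written as below. The symbols X, A, S, and W and their dimensions are assumptions taken from the standard ICA formulation, not notation confirmed by this abstract; how genotypes and the phenotype are encoded in X is likewise not specified here.

```latex
% Generic ICA model (assumed notation, not taken from the paper):
%   X : m x n matrix of observations (e.g., m measured variables, n individuals)
%   A : m x k unknown mixing matrix
%   S : k x n matrix whose rows are statistically independent sources
%   W : k x m unmixing matrix estimated by ICA
\[
  \mathbf{X} = \mathbf{A}\,\mathbf{S},
  \qquad
  \hat{\mathbf{S}} = \mathbf{W}\,\mathbf{X},
  \qquad
  \mathbf{W}\mathbf{A} \approx \mathbf{P}\,\mathbf{D}
\]
% P is a permutation matrix and D a diagonal scaling matrix:
% ICA recovers the sources only up to ordering and scale.
```

Under this reading, each column of A would describe how strongly one independent component loads on each observed variable, which is one plausible way to obtain groups of associated SNPs.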
INDEX TERMS
Independent component analysis (ICA), principal component analysis (PCA), single nucleotide polymorphisms (SNPs), linkage disequilibrium, complex diseases, association mapping
CITATION
Zaher Dawy, Michel Sarkis, Joachim Hagenauer, Jakob C. Mueller, "Fine-Scale Genetic Mapping Using Independent Component Analysis", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 5, no. 3, pp. 448-460, July-September 2008, doi:10.1109/TCBB.2007.1072