Issue No.02 - March/April (2012 vol.9)
pp: 499-510
Wenji Ma, Dept. of Comput. Sci., City Univ. of Hong Kong, Hong Kong, China
Yong Yang, Dept. of Comput. Sci., City Univ. of Hong Kong, Hong Kong, China
Zhi-Zhong Chen, Dept. of Inf. Syst. Design, Tokyo Denki Univ., Tokyo, Japan
Lusheng Wang, Dept. of Comput. Sci., City Univ. of Hong Kong, Hong Kong, China
ABSTRACT
Linkage analysis is a way of locating genes that cause genetic diseases. Linkage studies have facilitated the identification of several hundred human genes that can harbor mutations which by themselves lead to a disease phenotype. The fundamental problem in linkage analysis is to identify regions whose allele is shared by all or almost all affected members but by none or few unaffected members. Almost all existing methods for linkage analysis are designed for families with clearly given pedigrees. Little work has been done on the case where the sampled individuals are closely related but their pedigree is not known. This situation occurs very often when the individuals' common ancestor lived at least six generations ago. Solving this case would greatly extend the use of linkage analysis for finding genes that cause genetic diseases. In this paper, we propose a mathematical model (the shared center problem) for inferring the allele-sharing status of a given set of individuals using a database of confirmed haplotypes as reference. We show the NP-completeness of the shared center problem and present a ratio-2 polynomial-time approximation algorithm for its minimization version (called the closest shared center problem). We then convert the approximation algorithm into a heuristic algorithm for the shared center problem. Based on this heuristic, we finally design a heuristic algorithm for mutation region detection. We further implement the algorithms to obtain a software package. Our experimental data show that the software is both fast and accurate. The package is available at http://www.cs.cityu.edu.hk/~lwang/software/LDWP/ for noncommercial use.
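To give a feel for what a ratio-2 guarantee means for a center-type problem, the sketch below uses the classic closest string problem under Hamming distance. This is not the paper's algorithm, and the shared center problem itself is not defined in this abstract. It is a simpler, related problem substituted purely for illustration, and the function names (`hamming`, `radius`, `best_input_center`) are hypothetical. The idea is that any input string, taken as the center, is within twice the optimal radius of every other input, by the triangle inequality.

```python
from typing import Sequence


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def radius(center: str, strings: Sequence[str]) -> int:
    """Largest Hamming distance from `center` to any string in `strings`."""
    return max(hamming(center, s) for s in strings)


def best_input_center(strings: Sequence[str]) -> str:
    """Return the input string with the smallest radius.

    For an optimal center c* with radius opt and any inputs s_i, s_j:
        d(s_i, s_j) <= d(s_i, c*) + d(c*, s_j) <= 2 * opt,
    so every input string is already a ratio-2 solution; taking the best
    one can only improve on that bound.
    """
    if not strings:
        raise ValueError("need at least one string")
    return min(strings, key=lambda c: radius(c, strings))


if __name__ == "__main__":
    # Toy binary strings; illustrative only, not real haplotype data.
    toy = ["0011", "0101", "0110"]
    center = best_input_center(toy)
    print(center, radius(center, toy))
```

On the toy input, the string "0111" (not one of the inputs) is at distance 1 from every input, so the radius found by restricting to input strings stays within the factor-2 bound.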
INDEX TERMS
Algorithm design and analysis, Couplings, Approximation algorithms, Software algorithms, Biological cells, Heuristic algorithms, Inference algorithms, Haplotype inference, linkage analysis, pedigree, allele-sharing status, approximation algorithm
CITATION
Wenji Ma, Yong Yang, Zhi-Zhong Chen, Lusheng Wang, "Mutation Region Detection for Closely Related Individuals without a Known Pedigree", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.9, no. 2, pp. 499-510, March/April 2012, doi:10.1109/TCBB.2011.134