Finding Consistent Gene Transmission Patterns on Large and Complex Pedigrees
July-September 2006 (vol. 3 no. 3)
pp. 252-262
A heuristic algorithm for finding gene transmission patterns on large and complex pedigrees with partially observed genotype data is proposed. The method can be used to generate a starting point for a Markov chain Monte Carlo simulation or to check that a given pedigree and its genotype data are consistent. In small pedigrees, the algorithm is exact because it exhaustively enumerates all possibilities; in large pedigrees with a considerable amount of unknown data, only a subset of promising configurations can actually be checked. To select that subset, the configurations are ordered by combining an approximate conditional probability distribution of the unknown genotypes with information on the relationships between individuals. We also introduce a way to divide the task into subparts, which has proved useful in large pedigrees. The algorithm has been implemented in a program called APE (Allelic Path Explorer) and tested in three different settings with good results.
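
To make the consistency check concrete, the following is a minimal sketch of exhaustive backtracking on a small pedigree at a single locus. It is not the APE implementation: the function names (`mendelian_ok`, `find_consistent`), the dictionary-based pedigree layout, and the ordering of candidate genotypes by the product of allele frequencies are all assumptions made for illustration. The frequency ordering is only a crude stand-in for the approximate conditional probabilities described in the abstract.

```python
from itertools import combinations_with_replacement


def mendelian_ok(child, father, mother):
    """True if the child can take one allele from the father and the other from the mother."""
    a, b = child
    return (a in father and b in mother) or (b in father and a in mother)


def find_consistent(pedigree, observed, allele_freq):
    """Return a full genotype assignment obeying Mendelian inheritance, or None.

    pedigree:    dict person -> (father, mother), or None for founders.
                 Assumption: parents are listed before their children.
    observed:    dict person -> (allele, allele) for genotyped individuals.
    allele_freq: dict allele -> frequency, used only to order the candidates.
    """
    people = list(pedigree)
    genotypes = list(combinations_with_replacement(sorted(allele_freq), 2))
    # Try likelier genotypes first (illustrative ordering, not the paper's).
    genotypes.sort(key=lambda g: -allele_freq[g[0]] * allele_freq[g[1]])
    assignment = {}

    def candidates(person):
        # A genotyped individual has exactly one candidate: the observation.
        if person in observed:
            return [tuple(sorted(observed[person]))]
        return genotypes

    def backtrack(i):
        if i == len(people):
            return True  # everyone has a genotype and every trio was checked
        person = people[i]
        parents = pedigree[person]
        for g in candidates(person):
            if parents is None or mendelian_ok(
                g, assignment[parents[0]], assignment[parents[1]]
            ):
                assignment[person] = g
                if backtrack(i + 1):
                    return True
                del assignment[person]  # undo and try the next candidate
        return False

    return dict(assignment) if backtrack(0) else None


if __name__ == "__main__":
    trio = {"dad": None, "mom": None, "kid": ("dad", "mom")}
    freq = {1: 0.7, 2: 0.3}
    # The mother is untyped; the search has to fill in her genotype.
    print(find_consistent(trio, {"dad": (1, 1), "kid": (1, 2)}, freq))
    # Two 1/1 parents cannot have a 2/2 child, so the data are inconsistent.
    print(find_consistent(trio, {"dad": (1, 1), "mom": (1, 1), "kid": (2, 2)}, freq))
```

In the first call the search initially tries the most frequent genotype 1/1 for the mother, finds that it cannot explain the child's allele 2, and backtracks to 1/2. The second call returns `None` because no assignment satisfies the constraints. The sketch visits every configuration in the worst case, which is feasible only for small pedigrees; that limitation is what motivates checking only a promising subset in large ones.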

Index Terms:
Backtracking, heuristic methods, constraint satisfaction, sorting and searching, biology and genetics, pedigree, consistent genotype configuration.
Citation:
Matti Pirinen, Dario Gasbarra, "Finding Consistent Gene Transmission Patterns on Large and Complex Pedigrees," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 3, no. 3, pp. 252-262, July-Sept. 2006, doi:10.1109/TCBB.2006.36