Efficient Genotype Elimination via Adaptive Allele Consolidation
July-Aug. 2012 (vol. 9 no. 4)
pp. 1180-1189
N. De Francesco, Dipt. di Ing. dell'Inf.: Elettron., Inf., Telecomun., Univ. di Pisa, Pisa, Italy
G. Lettieri, Dipt. di Ing. dell'Inf.: Elettron., Inf., Telecomun., Univ. di Pisa, Pisa, Italy
L. Martini, S.p.A., Pisa, Italy
We propose Adaptive Allele Consolidation, a technique that greatly improves the performance of the Lange-Goradia algorithm for genotype elimination in pedigrees while still producing equivalent output. Genotype elimination consists of removing from a pedigree those genotypes that are impossible under the Mendelian laws of inheritance. It is used to find errors in genetic data and is useful as a preprocessing step for other analyses (such as linkage analysis or haplotype imputation). The problem of genotype elimination is intrinsically combinatorial. Allele Consolidation is an existing technique in which several alleles are replaced by a single "lumped" allele in order to reduce the number of genotype combinations that have to be considered, possibly at the expense of precision. In existing Allele Consolidation techniques, alleles are lumped once and for all before genotype elimination is performed. The idea of Adaptive Allele Consolidation is to dynamically change the set of alleles that are lumped together during the execution of the Lange-Goradia algorithm, so that both high performance and precision are achieved. We have implemented the technique in a tool called Celer and evaluated it on a large set of scenarios, with good results.
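
To make the two ideas in the abstract concrete, the following is a minimal Python sketch of genotype elimination on nuclear families, together with the static ("once and for all") form of allele lumping. It is not the authors' Celer implementation and does not implement the adaptive variant. The function names (`all_genotypes`, `compatible`, `eliminate`, `lump`), the string encoding of alleles, the lumped-allele label `"L"`, and the three-person pedigree are all assumptions made for illustration.

```python
from itertools import combinations_with_replacement, product


def all_genotypes(alleles):
    """Every unordered genotype (pair of alleles) over the given allele set."""
    return set(combinations_with_replacement(sorted(alleles), 2))


def compatible(child, father, mother):
    """Mendelian check: the child takes one allele from each parent."""
    a, b = child
    return (a in father and b in mother) or (b in father and a in mother)


def eliminate(genotypes, families):
    """Repeat nuclear-family elimination until no genotype list shrinks.

    genotypes: dict mapping person -> set of candidate genotypes
    families:  list of (father, mother, [children]) tuples
    """
    changed = True
    while changed:
        changed = False
        for father, mother, children in families:
            keep = {p: set() for p in (father, mother, *children)}
            for gf, gm in product(genotypes[father], genotypes[mother]):
                # Child genotypes that this parental pair could produce.
                options = [
                    {gc for gc in genotypes[c] if compatible(gc, gf, gm)}
                    for c in children
                ]
                if all(options):  # every child has at least one option
                    keep[father].add(gf)
                    keep[mother].add(gm)
                    for c, opts in zip(children, options):
                        keep[c] |= opts
            for p, kept in keep.items():
                if kept != genotypes[p]:
                    genotypes[p] = kept
                    changed = True
    return genotypes


def lump(alleles, observed, lumped="L"):
    """Static lumping: merge all unobserved alleles into one lumped allele."""
    kept = set(observed)
    return kept | ({lumped} if set(alleles) - kept else set())


# Hypothetical locus with six alleles; the mother is untyped.
alleles = {"1", "2", "3", "4", "5", "6"}
families = [("father", "mother", ["child"])]
typed = {"father": {("1", "2")}, "child": {("1", "3")}}

full = eliminate({**typed, "mother": all_genotypes(alleles)}, families)

reduced_alleles = lump(alleles, observed={"1", "2", "3"})
lumped = eliminate({**typed, "mother": all_genotypes(reduced_alleles)}, families)

print(len(all_genotypes(alleles)), "->", len(full["mother"]))
print(len(all_genotypes(reduced_alleles)), "->", len(lumped["mother"]))
```

In this toy case the untyped mother starts with 21 candidate genotypes without lumping and 10 with it, and the surviving lumped genotype `("3", "L")` stands in for `("3", "4")`, `("3", "5")`, and `("3", "6")`, so no information is lost. The abstract notes that this is not guaranteed in general, which is the motivation for changing the lumped set during elimination.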

Index Terms:
molecular biophysics, bioinformatics, cellular biophysics, combinatorial mathematics, genetics, Celer tool, genotype elimination, adaptive allele consolidation, Lange-Goradia algorithm, pedigrees, Mendelian inheritance law, genetic data error, linkage analysis, haplotype imputation, combinatorial problem, Vectors, Heuristic algorithms, Genetics, Bioinformatics, Computational biology, Polynomials, Couplings, pedigree, Genotype elimination, allele consolidation
Citation:
N. De Francesco, G. Lettieri, L. Martini, "Efficient Genotype Elimination via Adaptive Allele Consolidation," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 4, pp. 1180-1189, July-Aug. 2012, doi:10.1109/TCBB.2012.46