Issue No.06 - November/December (2011 vol.8)
pp: 1692-1699
Tzvika Hartman, Google, Tel Aviv
Danny Hermelin, Max Planck Institute for Informatics, Saarbrücken
Gad M. Landau, University of Haifa, Haifa
Frances Rosamond, The University of Newcastle, Newcastle
Liat Rozenberg, University of Haifa, Haifa
ABSTRACT
The haplotype inference problem (HIP) asks for a set of haplotypes that resolves a given set of genotypes. The problem is important in practical fields such as the investigation of diseases and other types of genetic mutations. To find haplotypes that are as close as possible to the real set of haplotypes comprising the genotypes, two models have been suggested and are by now well studied: the perfect phylogeny model and the pure parsimony model. All haplotype inference algorithms known so far may return haplotypes that are not necessarily plausible, i.e., very rare haplotypes or haplotypes that were never observed in the population. To overcome this disadvantage, we study in this paper a new constrained version of HIP under the above-mentioned models. In this version, a pool of plausible haplotypes $\widetilde{H}$ is given together with the set of genotypes $G$, and the goal is to find a subset $H \subseteq \widetilde{H}$ that resolves $G$. For constrained perfect phylogeny haplotyping (CPPH), we provide initial insights and polynomial-time algorithms for some restricted cases of the problem. For constrained parsimony haplotyping (CPH), we show that the problem is fixed-parameter tractable when parameterized by the size of the solution set of haplotypes.
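To make the constrained problem statement concrete, the following is a minimal brute-force sketch of constrained parsimony haplotyping. It is not the fixed-parameter algorithm of the paper; it simply tries subsets of the pool in order of increasing size and so runs in exponential time. It assumes the common encoding, which the abstract does not spell out, in which a haplotype is a string over 0/1 and a genotype is a string over 0/1/2, with 2 marking a site where the two haplotypes differ. The function names `resolves` and `smallest_resolving_subset` are hypothetical.

```python
from itertools import combinations


def resolves(h1, h2, g):
    """Return True if the haplotype pair (h1, h2) explains genotype g.

    Assumed encoding: haplotypes are strings over '01'; genotypes are
    strings over '012', where '2' marks a heterozygous site.
    """
    if not (len(h1) == len(h2) == len(g)):
        return False
    for a, b, c in zip(h1, h2, g):
        if c == "2":
            # Heterozygous site: the two haplotypes must differ here.
            if a == b:
                return False
        elif not (a == b == c):
            # Homozygous site: both haplotypes must carry the genotype's value.
            return False
    return True


def smallest_resolving_subset(pool, genotypes):
    """Brute-force search for a smallest subset of the pool resolving all genotypes.

    Returns a set of haplotypes, or None if no subset of the pool works.
    A haplotype may be paired with itself to explain a fully homozygous genotype.
    """
    candidates = sorted(set(pool))
    for k in range(1, len(candidates) + 1):
        for subset in combinations(candidates, k):
            if all(
                any(resolves(h1, h2, g) for h1 in subset for h2 in subset)
                for g in genotypes
            ):
                return set(subset)
    return None
```

For example, with the pool {00, 01, 10, 11} and the genotypes 22 and 01, the pair {01, 10} resolves both genotypes, whereas {00, 11} resolves only the first; restricting the pool to {00} leaves genotype 22 unresolvable.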
INDEX TERMS
Haplotyping, perfect phylogeny, pure parsimony, polynomial-time algorithms, parameterized complexity.
CITATION
Tzvika Hartman, Danny Hermelin, Gad M. Landau, Frances Rosamond, Liat Rozenberg, "Haplotype Inference Constrained by Plausible Haplotype Data," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 8, no. 6, pp. 1692-1699, November/December 2011, doi:10.1109/TCBB.2010.72