Subscribe
Issue No.02 - April-June (2008 vol.5)
pp: 245-251
ABSTRACT
We consider a combinatorial problem derived from haplotyping a population with respect to a genetic disease, either recessive or dominant. Given a set of individuals, partitioned into healthy and diseased, and the corresponding sets of genotypes, we want to infer bad'' and good'' haplotypes to account for these genotypes and for the disease. Assume e.g. the disease is recessive. Then, the resolving haplotypes must consist of \emph{bad} and \emph{good} haplotypes, so that (i) each genotype belonging to a diseased individual is explained by a pair of bad haplotypes and (ii) each genotype belonging to a healthy individual is explained by a pair of haplotypes of which at least one is good. We prove that the associated decision problem is NP-complete. However, we also prove that there is a simple solution, provided the data satisfy a very weak requirement.
INDEX TERMS
Combinatorics, Discrete Mathematics, Biology and genetics
CITATION
R. Ravi, Romeo Rizzi, "Haplotyping for Disease Association: A Combinatorial Approach", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.5, no. 2, pp. 245-251, April-June 2008, doi:10.1109/TCBB.2007.70255
REFERENCES
 [1] V. Bafna, D. Gusfield, G. Lancia, and S. Yooseph, “Haplotyping as Perfect Phylogeny: A Direct Approach,” J. Computational Biology, vol. 10, nos. 3-4, pp. 323-340, 2003. [2] D.G. Brown and J.M. Harrower, “A New Integer Programming Formulation for the Pure Parsimony Problem in Haplotype Analysis,” Proc. Fourth Int'l Workshop Algorithms in Bioinformatics, pp. 254-265, 2004. [3] A. Clark, “Inference of Haplotypes from PCR-Amplified Samples of Diploid Populations,” Molecular Biology Evolution, vol. 7, pp.111-122, 1990. [4] S.A. Cook, “The Complexity of Theorem-Proving Procedures,” Proc. Third Ann. ACM Symp. Theory of Computing, 1971. [5] Z. Ding, V. Filkov, and D. Gusfield, “A Linear-Time Algorithm for the Perfect Phylogeny Haplotyping Problem,” Proc. Ninth Ann. Int'l Conf. Research in Computational Molecular Biology, 2005. [6] E. Eskin, E. Halperin, and R. Karp, “Efficient Reconstruction of Haplotype Structure via Perfect Phylogeny,” J. Bioinformatics and Computational Biology, vol. 1, no. 1, pp. 1-20, 2003. [7] H. Greenberg, W. Hart, and G. Lancia, “Opportunities for Combinatorial Optimization in Computational Biology,” INFORMS J. Computing, vol. 16, no. 3, pp. 1-22, 2004. [8] D. Gusfield, “Haplotype Inference by Pure Parsimony,” Proc. 14th Ann. Symp. Combinatorial Pattern Matching, pp. 144-155, 2003. [9] D. Gusfield and S.H. Orzack, “Haplotype Inference,” Handbook of Computational Molecular Biology, pp. 1-28, Chapman and Hall/CRC Press, 2005. [10] L. Helmuth, “Genome Research: Map of the Human Genome 3.0,” Science, vol. 293, no. 5530, pp. 583-585, 2001. [11] Y.T. Huang, K.M. Chao, and T. Chen, “An Approximation Algorithm for Haplotype Inference by Maximum Parsimony,” Proc. 20th Ann. ACM Symp. Applied Computing, pp. 146-150, 2005. [12] G. Lancia, C. Pinotti, and R. Rizzi, “Haplotyping Populations by Pure Parsimony: Complexity, Exact and Approximation Algorithms,” INFORMS J. Computing, vol. 16, no. 4, pp. 17-29, 2004. [13] E. Marshall, “Drug Firms to Create Public Database of Genetic Mutations,” Science, vol. 284, no. 5413, pp. 406-407, 1999. [14] L. Wang and Y. Xu, “Haplotype Inference by Maximum Parsimony,” Bioinformatics, vol. 19, no. 14, pp. 1773-1780, 2003.