Pure Parsimony Xor Haplotyping
October-December 2010 (vol. 7 no. 4)
pp. 598-610
Paola Bonizzoni, Università degli Studi di Milano-Bicocca, Milano
Gianluca Della Vedova, Università degli Studi di Milano-Bicocca, Milano
Riccardo Dondi, Università degli Studi di Bergamo, Bergamo
Yuri Pirola, Università degli Studi di Milano-Bicocca, Milano
Romeo Rizzi, Università degli Studi di Udine, Udine
Haplotype resolution from xor-genotype data has recently been formulated as a new model for genetic studies [1]. Xor-genotype data is a cheaply obtainable type of data that distinguishes heterozygous from homozygous sites without identifying the homozygous alleles. In this paper, we propose a formulation based on a well-known model used in haplotype inference: pure parsimony. We exhibit exact solutions of the problem by providing polynomial time algorithms for some restricted cases and a fixed-parameter algorithm for the general case. These results are based on some interesting combinatorial properties of a graph representation of the solutions. Furthermore, we show that the problem has a polynomial time k-approximation, where k is the maximum number of xor-genotypes containing a given single nucleotide polymorphism (SNP). Finally, we propose a heuristic and produce an experimental analysis showing that it scales to real-world large instances taken from the HapMap project.
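To make the problem statement concrete, the following is a minimal illustrative sketch in Python, not the algorithms of the paper. It assumes the usual reading of the model: a haplotype is a 0/1 string over the SNP sites, a xor-genotype is the position-wise xor of two haplotypes (1 marks a heterozygous site), and a pure parsimony solution is a smallest set of haplotypes such that every input xor-genotype is the xor of two distinct haplotypes in the set. The function names (`xor_genotype`, `resolves`, `min_haplotype_set`) are hypothetical, the input is assumed to be a nonempty list of equal-length strings with at least one heterozygous site each, and the exhaustive search is exponential, so it is only usable on toy instances.

```python
from itertools import combinations, product


def xor_genotype(h1, h2):
    """Position-wise xor of two equal-length 0/1 haplotype strings."""
    if len(h1) != len(h2):
        raise ValueError("haplotypes must have the same length")
    return "".join("1" if a != b else "0" for a, b in zip(h1, h2))


def resolves(haplotypes, genotypes):
    """True if every xor-genotype is the xor of two distinct given haplotypes."""
    achievable = {xor_genotype(a, b) for a, b in combinations(haplotypes, 2)}
    return all(g in achievable for g in genotypes)


def min_haplotype_set(genotypes):
    """Exhaustive search for a smallest resolving haplotype set.

    Tries candidate sets in order of increasing size, so the first set
    that resolves all genotypes has minimum cardinality. Returns None
    if no set works (for example, if a genotype is all zeros, which
    this sketch does not handle).
    """
    m = len(genotypes[0])
    universe = ["".join(bits) for bits in product("01", repeat=m)]
    for size in range(2, len(universe) + 1):
        for candidate in combinations(universe, size):
            if resolves(candidate, genotypes):
                return list(candidate)
    return None


if __name__ == "__main__":
    # Three genotypes whose xor is all zeros: three haplotypes suffice.
    print(min_haplotype_set(["110", "011", "101"]))
    # Three genotypes whose xor is not all zeros: a fourth haplotype is needed.
    print(min_haplotype_set(["100", "010", "001"]))
```

The two toy inputs hint at the graph view mentioned in the abstract: if haplotypes are vertices and xor-genotypes are edges, three genotypes can close a triangle on three haplotypes only when they xor to zero; otherwise at least four haplotypes are required.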

[1] T. Barzuza, J.S. Beckmann, R. Shamir, and I. Pe'er, "Computational Problems in Perfect Phylogeny Haplotyping: Xor-Genotypes and Tag SNPs," Proc. 15th Ann. Symp. Combinatorial Pattern Matching (CPM '04), pp. 14-31, http://springerlink.metapress.com/openurl.asp?genre=article&issn=0302-9743&volume=3109&spage=14, July 2004.
[2] W. Xiao and P.J. Oefner, "Denaturing High-Performance Liquid Chromatography: A Review," Human Mutation, vol. 17, no. 6, pp. 439-474, 2001.
[3] T. Barzuza, J.S. Beckmann, R. Shamir, and I. Pe'er, "Computational Problems in Perfect Phylogeny Haplotyping: Typing without Calling the Allele," IEEE/ACM Trans. Computational Biology and Bioinformatics, vol. 5, no. 1, pp. 101-109, Jan.-Mar. 2008.
[4] D. Gusfield, "Haplotyping as Perfect Phylogeny: Conceptual Framework and Efficient Solutions," Proc. Sixth Ann. Conf. Research in Computational Molecular Biology (RECOMB), pp. 166-175, 2002.
[5] N. Patil, A.J. Berno, D.A. Hinds, W.A. Barrett, J.M. Doshi, C.R. Hacker, C.R. Kautzer, D.H. Lee, C. Marjoribanks, D.P. McDonough, B.T. Nguyen, M.C. Norris, J.B. Sheehan, N. Shen, D. Stern, R.P. Stokowski, D.J. Thomas, M.O. Trulson, K.R. Vyas, K.A. Frazer, S.P. Fodor, and D.R. Cox, "Blocks of Limited Haplotype Diversity Revealed by High-Resolution Scanning of Human Chromosome 21," Science, vol. 294, no. 5547, pp. 1719-1723, Nov. 2001.
[6] R. Sharan, B.V. Halldórsson, and S. Istrail, "Islands of Tractability for Parsimony Haplotyping," IEEE/ACM Trans. Computational Biology and Bioinformatics, vol. 3, no. 3, pp. 303-311, July-Sept. 2006.
[7] D. Gusfield, "Haplotype Inference by Pure Parsimony," Proc. 14th Symp. Combinatorial Pattern Matching (CPM), pp. 144-155, 2003.
[8] D.G. Brown and I.M. Harrower, "Integer Programming Approaches to Haplotype Inference by Pure Parsimony," IEEE/ACM Trans. Computational Biology and Bioinformatics, vol. 3, no. 2, pp. 141-154, Apr. 2006.
[9] G. Lancia, M.C. Pinotti, and R. Rizzi, "Haplotyping Populations by Pure Parsimony: Complexity of Exact and Approximation Algorithms," INFORMS J. Computing, vol. 16, no. 4, pp. 348-359, 2004.
[10] L. van Iersel, J. Keijsper, S. Kelk, and L. Stougie, "Shorelines of Islands of Tractability: Algorithms for Parsimony and Minimum Perfect Phylogeny Haplotyping Problems," IEEE/ACM Trans. Computational Biology and Bioinformatics, vol. 5, no. 2, pp. 301-312, Apr.-June 2008.
[11] G. Lancia and R. Rizzi, "A Polynomial Case of the Parsimony Haplotyping Problem," Operations Research Letters, vol. 34, no. 3, pp. 289-295, 2006.
[12] R. Diestel, Graph Theory, third ed., vol. 173, Springer-Verlag, 2005.
[13] R. Downey and M. Fellows, Parameterized Complexity. Springer-Verlag, 1999.
[14] C. Savage, "A Survey of Combinatorial Gray Codes," SIAM Rev., vol. 39, no. 4, pp. 605-629, 1997.
[15] J.R. Bitner, G. Ehrlich, and E.M. Reingold, "Efficient Generation of the Binary Reflected Gray Code and Its Applications," Comm. ACM, vol. 19, no. 9, pp. 517-521, 1976.
[16] E. Fredkin, "Trie Memory," Comm. ACM, vol. 3, no. 9, pp. 490-499, 1960.
[17] W.T. Tutte, "An Algorithm for Determining whether a Given Binary Matroid Is Graphic," Proc. Am. Math. Soc., vol. 11, no. 6, pp. 905-917, 1960.
[18] R.E. Bixby and D.K. Wagner, "An Almost Linear-Time Algorithm for Graph Realization," Math. of Operations Research, vol. 13, pp. 99-123, 1988.
[19] S. Fujishige, "An Efficient PQ-Graph Algorithm for Solving the Graph Realization Problem," J. Computer and System Science, vol. 21, pp. 63-68, 1980.
[20] T. Barzuza, GREAL: Software for the Graph Realization Problem, 2010.
[21] F. Gavril and R. Tamari, "An Algorithm for Constructing Edge-Trees from Hypergraphs," Networks, vol. 13, no. 3, pp. 377-388, 1983.
[22] R.R. Hudson, "Generating Samples under a Wright-Fisher Neutral Model of Genetic Variation," Bioinformatics, vol. 18, no. 2, pp. 337-338, Feb. 2002.
[23] GLPK: the GNU Linear Programming Kit, http://www.gnu.org/software/glpk/, 2010.
[24] The International HapMap Consortium, "A Haplotype Map of the Human Genome," Nature, vol. 437, no. 7063, pp. 1299-1320, 2005.

Index Terms:
Algorithms, haplotype resolution, pure parsimony, approximation algorithms, graph representation.
Paola Bonizzoni, Gianluca Della Vedova, Riccardo Dondi, Yuri Pirola, Romeo Rizzi, "Pure Parsimony Xor Haplotyping," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 7, no. 4, pp. 598-610, Oct.-Dec. 2010, doi:10.1109/TCBB.2010.52