Issue No. 04 - October-December (2010 vol. 7)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2010.52
Gianluca Della Vedova, Università degli Studi di Milano-Bicocca, Milano
Yuri Pirola, Università degli Studi di Milano-Bicocca, Milano
Riccardo Dondi, Università degli Studi di Bergamo, Bergamo
Romeo Rizzi, Università degli Studi di Udine, Udine
Paola Bonizzoni, Università degli Studi di Milano-Bicocca, Milano
Haplotype resolution from xor-genotype data has recently been formulated as a new model for genetic studies. Xor-genotype data is a cheaply obtainable type of data that distinguishes heterozygous from homozygous sites without identifying the homozygous alleles. In this paper, we propose a formulation based on a well-known model used in haplotype inference: pure parsimony. We exhibit exact solutions of the problem by providing polynomial-time algorithms for some restricted cases and a fixed-parameter algorithm for the general case. These results are based on some interesting combinatorial properties of a graph representation of the solutions. Furthermore, we show that the problem has a polynomial-time k-approximation, where k is the maximum number of xor-genotypes containing a given single nucleotide polymorphism (SNP). Finally, we propose a heuristic and present an experimental analysis showing that it scales to large real-world instances taken from the HapMap project.
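To make the data model concrete, the following is a minimal sketch of what an xor-genotype records and of what it means for a set of haplotypes to resolve a set of xor-genotypes. It assumes haplotypes are encoded as equal-length binary strings (one character per SNP); the function names `xor_genotype` and `resolves` are illustrative and do not come from the paper. It only checks a candidate solution and is not one of the algorithms described above.

```python
from itertools import combinations_with_replacement


def xor_genotype(h1, h2):
    """Return the set of heterozygous site indices of two binary haplotypes.

    A site is heterozygous when the two haplotypes differ there; the
    alleles at homozygous sites are not recorded.
    """
    if len(h1) != len(h2):
        raise ValueError("haplotypes must have the same length")
    return frozenset(i for i, (a, b) in enumerate(zip(h1, h2)) if a != b)


def resolves(haplotypes, xor_genotypes):
    """Check that every xor-genotype is produced by some pair of haplotypes.

    Pairs include a haplotype with itself, which yields the empty
    xor-genotype (an individual homozygous at every site).
    """
    produced = {
        xor_genotype(a, b)
        for a, b in combinations_with_replacement(haplotypes, 2)
    }
    return all(frozenset(g) in produced for g in xor_genotypes)


# Three haplotypes are enough to explain these three xor-genotypes.
haplotypes = ["000", "110", "011"]
observed = [{0, 1}, {1, 2}, {0, 2}]
print(resolves(haplotypes, observed))
```

Under the pure parsimony criterion, the goal is a resolving set with as few distinct haplotypes as possible; the sketch above verifies a candidate set but does not search for a minimum one.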
Algorithms, haplotype resolution, pure parsimony, approximation algorithms, graph representation.
Gianluca Della Vedova, Yuri Pirola, Riccardo Dondi, Romeo Rizzi, Paola Bonizzoni, "Pure Parsimony Xor Haplotyping", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 7, no. 4, pp. 598-610, October-December 2010, doi:10.1109/TCBB.2010.52