Issue No. 02 - April-June (2006 vol. 3)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2006.28
We describe TreeDT, a novel association-based gene mapping method. Given a set of disease-associated haplotypes and a set of control haplotypes, TreeDT predicts likely locations of a disease susceptibility gene. TreeDT extracts, essentially in the form of haplotype trees, information about historical recombinations in the population: a haplotype tree constructed at a given chromosomal location is an estimate of the genealogy of the haplotypes. TreeDT constructs these trees for all locations on the given haplotypes and performs a novel disequilibrium test on each tree: is there a small set of subtrees with relatively high proportions of disease-associated chromosomes, suggesting a shared genetic history for those chromosomes and a likely disease gene location? We give a detailed description of TreeDT and the tree disequilibrium tests, we analyze the algorithm formally, and we evaluate its performance experimentally on both simulated and real data sets. Experimental results demonstrate that TreeDT has high accuracy on difficult mapping tasks, and comparisons with other methods (EATDT, HPM, TDT) show that TreeDT is very competitive.
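The question posed by the tree disequilibrium test can be illustrated with a small sketch. The code below is not the TreeDT statistic, which this abstract does not specify. It assumes a haplotype tree given as nested tuples with leaves labeled "D" (disease-associated) or "C" (control), considers only a single subtree rather than a small set of subtrees, and substitutes an ordinary 2x2 chi-square score for the paper's test. All names (`count_subtrees`, `chi_square_2x2`, `best_subtree`) are hypothetical.

```python
from typing import List, Tuple, Union

# Assumed representation: a leaf is "D" or "C"; an internal node is a tuple of children.
Tree = Union[str, tuple]


def count_subtrees(tree: Tree, out: List[Tuple[int, int]]) -> Tuple[int, int]:
    """Append the (disease, control) leaf counts of every subtree to `out`.

    Returns the counts for `tree` itself.
    """
    if isinstance(tree, str):
        counts = (1, 0) if tree == "D" else (0, 1)
    else:
        d = c = 0
        for child in tree:
            child_d, child_c = count_subtrees(child, out)
            d += child_d
            c += child_c
        counts = (d, c)
    out.append(counts)
    return counts


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def best_subtree(tree: Tree) -> Tuple[float, Tuple[int, int]]:
    """Return (score, (disease, control)) for the most disease-enriched subtree.

    Stand-in scoring only: each subtree is compared with the rest of the tree.
    """
    all_counts: List[Tuple[int, int]] = []
    total_d, total_c = count_subtrees(tree, all_counts)
    best = (0.0, (0, 0))
    for d, c in all_counts:
        # Skip subtrees whose disease proportion does not exceed the overall one.
        if d * total_c <= c * total_d:
            continue
        score = chi_square_2x2(d, c, total_d - d, total_c - c)
        if score > best[0]:
            best = (score, (d, c))
    return best


if __name__ == "__main__":
    example = ((("D", "D"), "D"), (("C", "C"), ("C", "D")))
    print(best_subtree(example))
```

On the toy tree in the example, the subtree holding three disease-associated leaves and no controls scores highest. A mapping method in the spirit of the abstract would repeat such a test on the tree built at each chromosomal location and would need a proper significance assessment, both of which this sketch leaves out.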
Biology and genetics, nonparametric statistics, nonnumerical algorithms and problems.
V. Ollikainen, P. Sevon and H. Toivonen, "TreeDT: Tree Pattern Mining for Gene Mapping," in IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 3, no. 2, pp. 174-185, 2006.