IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2007, vol. 4, Issue No. 01, January-March

pp: 108-116

ABSTRACT

We study distorted metrics on binary trees in the context of phylogenetic reconstruction. Given a binary tree $T$ on $n$ leaves with a path metric $d$, consider the pairwise distances $\{d(u,v)\}$ between leaves. It is well known that these determine the tree and the $d$-length of all edges. Here, we consider distortions $\hat{d}$ of $d$ such that, for all leaves $u$ and $v$, it holds that $|d(u,v) - \hat{d}(u,v)| < f/2$ if either $d(u,v) < M + f/2$ or $\hat{d}(u,v) < M + f/2$, where $d$ satisfies $f \le d(e) \le g$ for all edges $e$. Given such distortions, we show how to reconstruct in polynomial time a forest $T_1, \ldots, T_\alpha$ such that the true tree $T$ may be obtained from that forest by adding $\alpha - 1$ edges and $\alpha - 1 \le 2^{-\Omega(M/g)} n$. Our distorted metric result implies a reconstruction algorithm of phylogenetic forests with a small number of trees from sequences of length logarithmic in the number of species. The reconstruction algorithm is applicable for the general Markov model. Both the distorted metric result and its applications to phylogeny are almost tight.
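The distortion condition stated in the abstract can be read as a simple pairwise check. The following is a minimal sketch of that check only, not the paper's reconstruction algorithm. The function name `is_distorted_metric`, the dictionary layout for distances, and the three-leaf example values are all assumptions made for illustration.

```python
from itertools import combinations


def is_distorted_metric(d, d_hat, leaves, M, f):
    """Check the abstract's condition: whenever d(u,v) < M + f/2 or
    d_hat(u,v) < M + f/2, require |d(u,v) - d_hat(u,v)| < f/2.

    d and d_hat are dicts keyed by leaf pairs (u, v), in the order
    produced by itertools.combinations(leaves, 2). (Hypothetical layout.)
    """
    threshold = M + f / 2
    for u, v in combinations(leaves, 2):
        true_dist = d[(u, v)]
        est_dist = d_hat[(u, v)]
        # Only "short" pairs are constrained; long pairs may be arbitrary.
        is_short = true_dist < threshold or est_dist < threshold
        if is_short and not abs(true_dist - est_dist) < f / 2:
            return False
    return True


# Illustrative data (assumed): a star tree with leaf edges of length 1, 1, 2.
leaves = ["a", "b", "c"]
d = {("a", "b"): 2.0, ("a", "c"): 3.0, ("b", "c"): 3.0}

# Short pair (a, b) is estimated accurately; long pairs are far off.
d_hat_ok = {("a", "b"): 2.3, ("a", "c"): 10.0, ("b", "c"): 7.0}

# Short pair (a, b) is off by more than f/2.
d_hat_bad = {("a", "b"): 2.6, ("a", "c"): 10.0, ("b", "c"): 7.0}

ok_result = is_distorted_metric(d, d_hat_ok, leaves, M=2.0, f=1.0)
bad_result = is_distorted_metric(d, d_hat_bad, leaves, M=2.0, f=1.0)
```

In this sketch only pairs within the radius $M + f/2$ are constrained, which matches the abstract's setting where long distances cannot be estimated reliably and may be arbitrary.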

INDEX TERMS

Phylogenetics, tree, forest, CFN, Jukes-Cantor, metric, distortion.

CITATION

Elchanan Mossel, "Distorted Metrics on Trees and Phylogenetic Forests", *IEEE/ACM Transactions on Computational Biology and Bioinformatics*, vol. 4, no. 1, pp. 108-116, January-March 2007, doi:10.1109/TCBB.2007.1010