2008 International Conference on BioMedical Engineering and Informatics
Conditional LZ Complexity of DNA Sequences Analysis and its Application in Phylogenetic Tree Reconstruction
May 27-May 30
ISBN: 978-0-7695-3118-2
A DNA sequence can be identified with a word over an alphabet N = {A, C, G, T}. Characteristic sequences of a DNA sequence are given in terms of classifications of the bases of nucleic acids. Here we propose a new measure for the similarity analysis of DNA sequences. It is based on conditional LZ complexity and the (0,1) characteristic sequences of DNA primary sequences. This measure enables biologists to extract similarity information from biological sequences according to their requirements. For example, with this measure, one can obtain either the full similarity information or a similarity analysis from a given biological aspect. Moreover, the length of the DNA primary sequence is not problematic. This new measure has been applied to phylogenetic tree construction based on a conditional LZ complexity distance matrix. The application of the measure to the phylogenetic tree construction of 22 species shows its flexibility.
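The abstract does not spell out its definitions, so the following is only a minimal sketch of how such a measure might be computed, under stated assumptions. It assumes the three usual base classifications for the (0,1) characteristic sequences (purine/pyrimidine, amino/keto, weak/strong hydrogen bond), the Lempel-Ziv (1976) exhaustive-history component count for LZ complexity, a conditional complexity taken as c(S|Q) = c(QS) - c(Q), and a max-normalized distance of the kind used in other LZ-based sequence comparisons. All function names and the `CLASSES` table are hypothetical, and the paper's actual definitions may differ.

```python
# Assumed base classifications: bases listed here map to 1, the others to 0.
CLASSES = {
    "RY": "AG",  # purine (A, G) vs pyrimidine (C, T)
    "MK": "AC",  # amino (A, C) vs keto (G, T)
    "WS": "AT",  # weak (A, T) vs strong (C, G) hydrogen bond
}


def characteristic(seq, scheme):
    """Map a DNA word over {A, C, G, T} to a (0,1) characteristic sequence."""
    ones = CLASSES[scheme]
    return "".join("1" if base in ones else "0" for base in seq.upper())


def lz_complexity(s):
    """Count the components of the LZ76 exhaustive history of s."""
    n, i, count = len(s), 0, 0
    while i < n:
        k = 1
        # Grow the candidate while it can be copied from the text
        # that precedes its own last symbol.
        while i + k <= n and s[i:i + k] in s[:i + k - 1]:
            k += 1
        count += 1
        i += k
    return count


def conditional_lz(s, given):
    """Assumed conditional complexity: components needed for s once given is known."""
    return lz_complexity(given + s) - lz_complexity(given)


def distance(a, b):
    """Assumed max-normalized, symmetric distance between two non-empty sequences."""
    return max(conditional_lz(a, b), conditional_lz(b, a)) / max(
        lz_complexity(a), lz_complexity(b)
    )


def distance_matrix(seqs, scheme=None):
    """Pairwise distances on raw sequences, or on one characteristic sequence."""
    if scheme is not None:
        seqs = [characteristic(s, scheme) for s in seqs]
    return [[distance(a, b) for b in seqs] for a in seqs]
```

Calling `distance_matrix(seqs)` compares the full four-letter sequences, while `distance_matrix(seqs, "RY")` restricts the comparison to one classification, which is one way to read the abstract's "similarity analysis from a given biological aspect". The resulting matrix could then be passed to any standard distance-based tree-building method; the abstract does not say which one was used.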
Citation:
Jingjun Liu, Dachao Li, "Conditional LZ Complexity of DNA Sequences Analysis and its Application in Phylogenetic Tree Reconstruction," BMEI, vol. 1, pp. 111-116, 2008 International Conference on BioMedical Engineering and Informatics, 2008