Issue No. 03 - May-June (2012 vol. 9)

ISSN: 1545-5963

pp: 828-836

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2012.24

P. Jarvis, Sch. of Math. & Phys., Univ. of Tasmania, Hobart, TAS, Australia

J. Sumner, Sch. of Math. & Phys., Univ. of Tasmania, Hobart, TAS, Australia

ABSTRACT

We consider novel phylogenetic models with rate matrices that arise via the embedding of a progenitor model on a small number of character states into a target model on a larger number of character states. Adapting representation-theoretic results from recent investigations of Markov invariants for the general rate matrix model, we give a prescription for identifying and counting Markov invariants for such "symmetric embedded" models, and we provide enumerations of these for the first few cases with a small number of character states. The simplest example is a target model on three states, constructed from a general two-state model: the "2 ↪ 3" embedding. We show that for two taxa, there exist two invariants of quadratic degree that can be used to directly infer pairwise distances from observed sequences under this model. A simple simulation study verifies their theoretical expected values and suggests that, given the appropriateness of the model class, they have statistical properties superior to those of the standard (log) Det invariant (which is of cubic degree for this case).
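As a rough illustration of the log-Det baseline mentioned in the abstract, the sketch below builds a two-taxon joint distribution on three states and checks that the log of its determinant is linear in the total branch length. The rate matrix `Q`, root distribution `pi`, branch lengths, and helper names are all invented for illustration. The code does not reproduce the paper's quadratic invariants or its embedding construction; it only shows why the determinant, a cubic polynomial in the entries of a 3 x 3 pattern-frequency matrix, yields a distance.

```python
import math

def matmul(a, b):
    """Plain nested-list matrix product for small square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(q, t, terms=40):
    """Truncated Taylor series for exp(q * t); adequate for small, modest entries."""
    n = len(q)
    qt = [[q[i][j] * t for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = matmul(term, qt)
        term = [[x / k for x in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def det3(m):
    """Determinant of a 3 x 3 matrix: a cubic polynomial in its entries."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Hypothetical 3-state rate matrix (rows sum to zero); not taken from the paper.
Q = [[-0.9, 0.5, 0.4],
     [0.2, -0.5, 0.3],
     [0.1, 0.6, -0.7]]
pi = [0.5, 0.3, 0.2]   # hypothetical root distribution
t1, t2 = 0.3, 0.45     # hypothetical branch lengths to the two taxa

M1, M2 = expm(Q, t1), expm(Q, t2)

# Joint pattern distribution: P[i][j] = sum_k pi[k] * M1[k][i] * M2[k][j]
P = [[sum(pi[k] * M1[k][i] * M2[k][j] for k in range(3)) for j in range(3)]
     for i in range(3)]

trace_q = sum(Q[i][i] for i in range(3))
log_det = math.log(det3(P))
# det(P) = det(M1) * prod(pi) * det(M2), and det(exp(Q t)) = exp(t * trace(Q))
expected = math.log(pi[0] * pi[1] * pi[2]) + (t1 + t2) * trace_q
print(log_det, expected)
```

In this sketch the two printed values agree up to floating-point error, so subtracting the root-distribution term from `log_det` gives a quantity proportional to `t1 + t2`, which is the sense in which log-Det serves as a pairwise distance.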

INDEX TERMS

statistical analysis, embedded systems, evolution (biological), genetics, Markov processes, M-theory, physiological models, standard det invariant, Markov invariants, phylogenetic rate matrices, progenitor model, general rate matrix model, symmetric embedded models, statistical properties, Phylogeny, Adaptation models, Polynomials, Tensile stress, Algebra, Biological system modeling, representation theory, Markov chains

CITATION

P. Jarvis and J. Sumner, "Markov Invariants for Phylogenetic Rate Matrices Derived from Embedded Submodels," in *IEEE/ACM Transactions on Computational Biology and Bioinformatics*, vol. 9, no. 3, pp. 828-836, 2012.

doi:10.1109/TCBB.2012.24
