Issue No. 02 - March/April (2012 vol. 9)

ISSN: 1545-5963

pp: 517-534

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2011.128

S. Kelk, Dept. of Knowledge Eng. (DKE), Maastricht Univ., Maastricht, Netherlands

C. Scornavacca, Center for Bioinf. (ZBIT), Tubingen Univ., Tubingen, Germany

L. van Iersel, Dept. of Math. & Stat., Univ. of Canterbury, Christchurch, New Zealand

ABSTRACT

Rooted phylogenetic networks are often used to represent conflicting phylogenetic signals. Given a set of clusters, a network is said to represent these clusters in the softwired sense if, for each cluster in the input set, at least one tree embedded in the network contains that cluster. Motivated by parsimony, we might wish to construct such a network using as few reticulations as possible, or minimizing the level of the network, i.e., the maximum number of reticulations used in any "tangled" region of the network. Although these are NP-hard problems, here we prove that, for every fixed k ≥ 0, it is polynomial-time solvable to construct a phylogenetic network with level equal to k representing a cluster set, or to determine that no such network exists. However, this algorithm does not lend itself to a practical implementation. We also prove that the comparatively efficient CASS algorithm correctly solves this problem (and also minimizes the reticulation number) when the input clusters are obtained from two (not necessarily binary) gene trees on the same set of taxa, but that it does not always minimize level for general cluster sets. Finally, we describe a new algorithm which, for every fixed r ≥ 0, generates in polynomial time all binary phylogenetic networks with exactly r reticulations representing a set of input clusters.
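
The softwired definition above can be illustrated with a small brute-force check. This is only a sketch under stated assumptions, not the CASS algorithm or either algorithm from the paper: the network is assumed to be a rooted DAG given as a hypothetical `children` dictionary, leaves are taxa, and every node with more than one parent is treated as a reticulation. The function names and the example network `net` are invented for illustration. The check enumerates every way of keeping exactly one incoming edge per reticulation, so its running time is exponential in the number of reticulations.

```python
from itertools import product


def softwired_clusters(children):
    """Collect every cluster carried by some tree embedded in the network.

    `children` maps each node to a list of its children (assumed format);
    nodes without children are leaves, i.e. taxa.
    """
    # Build the parent lists so that reticulations can be identified.
    parents = {}
    for u, kids in children.items():
        for v in kids:
            parents.setdefault(v, []).append(u)
    reticulations = [v for v, ps in parents.items() if len(ps) > 1]

    def leaves_below(node, kept_parent):
        kids = children.get(node, [])
        if not kids:
            return frozenset([node])
        result = frozenset()
        for child in kids:
            # An edge into a reticulation is followed only if it is the
            # one incoming edge kept ("switched on") for that reticulation.
            if kept_parent.get(child, node) == node:
                result |= leaves_below(child, kept_parent)
        return result

    found = set()
    # One embedded tree per choice of kept parent for each reticulation.
    for choice in product(*(parents[r] for r in reticulations)):
        kept_parent = dict(zip(reticulations, choice))
        for node in parents:  # every non-root node
            cluster = leaves_below(node, kept_parent)
            if cluster:  # skip dead ends whose edges are all switched off
                found.add(cluster)
    return found


def represents_softwired(children, clusters):
    """True if every input cluster appears in at least one embedded tree."""
    available = softwired_clusters(children)
    return all(frozenset(c) in available for c in clusters)


# Hypothetical network on taxa a, b, c with a single reticulation h.
net = {
    "root": ["u", "v"],
    "u": ["a", "h"],
    "v": ["h", "c"],
    "h": ["b"],
}
```

In this example, keeping the edge from `u` into `h` gives an embedded tree containing the cluster {a, b}, and keeping the edge from `v` gives one containing {b, c}. The network therefore represents both clusters in the softwired sense, even though no single tree contains both, whereas {a, c} is not represented.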

INDEX TERMS

Phylogeny, Clustering algorithms, Vegetation, Polynomials, Generators, Binary trees, Minimization, polynomial-time algorithms, Rooted phylogenetic networks, clusters, reticulate evolution, parsimony, computational complexity

CITATION

S. Kelk, C. Scornavacca, L. van Iersel, "On the Elusiveness of Clusters", *IEEE/ACM Transactions on Computational Biology and Bioinformatics*, vol. 9, no. 2, pp. 517-534, March/April 2012, doi:10.1109/TCBB.2011.128