Tulaya Limpiti is with the Faculty of Engineering, King Mongkut’s Institute of Technology Ladkrabang, Bangkok 10520, Thailand.
Understanding genetic differences among populations is one of the most important issues in population genetics. Genetic variations, e.g., single nucleotide polymorphisms, are used to characterize the commonalities and differences among individuals from various populations. This paper presents iNJclust, an efficient graph-based clustering framework that operates iteratively on the Neighbor-Joining (NJ) tree. The framework uses well-known genetic measurements, namely the allele-sharing distance, the neighbor-joining tree, and the fixation index. The behavior of the fixation index is used in the algorithm’s stopping criterion. The algorithm outputs an estimated number of populations, the assignment of individuals to those populations, and the relationships between populations. The clustering result is reported as a binary tree whose terminal nodes represent the final inferred populations and whose structure preserves the genetic relationships among them. The clustering performance and robustness of the proposed algorithm are tested extensively using simulated datasets and real datasets from bovine, sheep, and human populations. The results indicate that the number of populations within each dataset is reasonably estimated, the individual assignment is robust, and the structure of the inferred population tree corresponds to the intrinsic relationships among populations within the data.
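The abstract names the allele-sharing distance as the first of the measurements the framework uses, without defining it. As a rough illustration only, the sketch below uses a commonly cited definition for biallelic SNPs: with genotypes coded 0, 1, or 2 (copies of one allele), the per-locus distance between two individuals is 0, 1, or 2 according to whether they share two, one, or no alleles, and these values are averaged over loci. The function name `allele_sharing_distance`, the 0/1/2 coding, the absence of missing genotypes, and the toy data are all assumptions made for this example and are not taken from the paper.

```python
from itertools import combinations


def allele_sharing_distance(genotypes):
    """Return the pairwise allele-sharing distance matrix.

    genotypes: list of equal-length lists, one per individual, with each
    biallelic SNP coded as 0, 1, or 2 (assumed coding, no missing values).
    """
    n = len(genotypes)
    n_loci = len(genotypes[0])
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        # For 0/1/2 coding, |a - b| is 0 when both alleles are shared,
        # 1 when one allele is shared, and 2 when none are shared.
        total = sum(abs(a - b) for a, b in zip(genotypes[i], genotypes[j]))
        dist[i][j] = dist[j][i] = total / n_loci
    return dist


# Toy data: three individuals genotyped at four SNPs (illustrative only).
genotypes = [
    [0, 1, 2, 0],
    [0, 1, 2, 1],
    [2, 1, 0, 2],
]
asd = allele_sharing_distance(genotypes)
```

A symmetric matrix of this kind is the usual input to a neighbor-joining tree construction, which is why the distance is listed ahead of the NJ tree among the framework's components.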
Tulaya Limpiti, Chainarong Amornbunchornvej, Apichart Intarapanich, Anunchai Assawamakin, Sissades Tongsima, "iNJclust: Iterative Neighbor-Joining Tree Clustering Framework for Inferring Population Structure", IEEE/ACM Transactions on Computational Biology and Bioinformatics, no. 1, pp. 1, PrePrints, doi:10.1109/TCBB.2014.2322372