IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 11, no. 5, Sept.-Oct. 2014
Tulaya Limpiti, Faculty of Engineering, King Mongkut's Institute of Technology Ladkrabang (KMITL), Bangkok, Thailand
Chainarong Amornbunchornvej, Faculty of Engineering, King Mongkut's Institute of Technology Ladkrabang (KMITL), Bangkok, Thailand
Apichart Intarapanich, National Electronics and Computer Technology Center (NECTEC), Klongluang, Pathumthani, Thailand
Anunchai Assawamakin, Department of Pharmacology, Faculty of Pharmacy, Mahidol University, Rajathevi, Bangkok, Thailand
Sissades Tongsima, National Center for Genetic Engineering and Biotechnology (BIOTEC), Klongluang, Pathumthani, Thailand
Understanding genetic differences among populations is one of the most important issues in population genetics. Genetic variations, e.g., single nucleotide polymorphisms, are used to characterize the commonalities and differences among individuals from various populations. This paper presents iNJclust, an efficient graph-based clustering framework that operates iteratively on the Neighbor-Joining (NJ) tree. The framework uses well-known genetic measurements, namely the allele-sharing distance, the neighbor-joining tree, and the fixation index. The behavior of the fixation index is utilized in the algorithm’s stopping criterion. The algorithm outputs an estimated number of populations, individual assignments, and the relationships between populations. The clustering result is reported as a binary tree whose terminal nodes represent the final inferred populations and whose structure preserves the genetic relationships among them. The clustering performance and robustness of the proposed algorithm are tested extensively using simulated and real data sets from bovine, sheep, and human populations. The results indicate that the number of populations within each data set is reasonably estimated, the individual assignment is robust, and the structure of the inferred population tree corresponds to the intrinsic relationships among populations within the data.
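The abstract names the allele-sharing distance and the fixation index but does not give their formulas, so the following is only a minimal sketch of one common way to compute them, not the paper's implementation. It assumes biallelic SNPs coded as 0/1/2 copies of a reference allele, an allele-sharing distance normalized to [0, 1], and Wright's definition F_ST = (H_T - H_S) / H_T with equal group weights. The function names `allele_sharing_distance`, `asd_matrix`, and `fst_two_groups` are hypothetical.

```python
from itertools import combinations


def allele_sharing_distance(g1, g2):
    """Allele-sharing distance between two individuals (genotypes coded 0/1/2).

    Assumption: at a biallelic locus, |a - b| counts the alleles that are
    not shared, and the sum is normalized by 2 * (number of loci).
    """
    if len(g1) != len(g2) or not g1:
        raise ValueError("genotype vectors must be non-empty and of equal length")
    return sum(abs(a - b) for a, b in zip(g1, g2)) / (2 * len(g1))


def asd_matrix(genotypes):
    """Symmetric pairwise distance matrix, e.g. as input to a neighbor-joining routine."""
    n = len(genotypes)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = allele_sharing_distance(genotypes[i], genotypes[j])
    return dist


def fst_two_groups(group_a, group_b):
    """Wright's F_ST between two groups, computed as a ratio of sums over loci.

    Assumption: equal group weights and expected heterozygosity 2p(1 - p);
    the estimator used in the paper may differ.
    """
    n_loci = len(group_a[0])
    h_s_sum = 0.0  # mean within-group expected heterozygosity, summed over loci
    h_t_sum = 0.0  # pooled expected heterozygosity, summed over loci
    for k in range(n_loci):
        p_a = sum(ind[k] for ind in group_a) / (2 * len(group_a))
        p_b = sum(ind[k] for ind in group_b) / (2 * len(group_b))
        p_bar = (p_a + p_b) / 2
        h_s_sum += (2 * p_a * (1 - p_a) + 2 * p_b * (1 - p_b)) / 2
        h_t_sum += 2 * p_bar * (1 - p_bar)
    return 0.0 if h_t_sum == 0 else (h_t_sum - h_s_sum) / h_t_sum
```

In this sketch, a split of the tree that separates genetically distinct groups yields an F_ST near 1, while a split within a homogeneous group yields a value near 0. That contrast is the kind of behavior a stopping criterion based on the fixation index could exploit.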
Sociology, Statistics, Clustering algorithms, Genetics, Bioinformatics, Variable speed drives, IEEE transactions, population structure analysis, Allele-sharing distance, clustering, fixation index, neighbor-joining tree
Tulaya Limpiti, Chainarong Amornbunchornvej, Apichart Intarapanich, Anunchai Assawamakin, Sissades Tongsima, "iNJclust: Iterative Neighbor-Joining Tree Clustering Framework for Inferring Population Structure", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 11, no. 5, pp. 903-914, Sept.-Oct. 2014, doi:10.1109/TCBB.2014.2322372