A Consensus Tree Approach for Reconstructing Human Evolutionary History and Detecting Population Substructure
Issue No. 04 - July/August (2011 vol. 8)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2011.23
Ming-Chi Tsai, Joint CMU-Pitt PhD Program in Computational Biology, Pittsburgh
Russell Schwartz, Carnegie Mellon University, Pittsburgh
Guy Blelloch, Carnegie Mellon University, Pittsburgh
R. Ravi, Carnegie Mellon University, Pittsburgh
The random accumulation of variations in the human genome over time implicitly encodes a history of how human populations have arisen, dispersed, and intermixed since we emerged as a species. Reconstructing that history is a challenging computational and statistical problem but has important applications both to basic research and to the discovery of genotype-phenotype correlations. We present a novel approach to inferring human evolutionary history from genetic variation data. We use the idea of consensus trees, a technique generally used to reconcile species trees from divergent gene trees, adapting it to the problem of finding robust relationships within a set of intraspecies phylogenies derived from local regions of the genome. Validation on both simulated and real data shows that the method effectively recapitulates the known true structure of the data and closely matches our best current understanding of human evolutionary history. Additional comparison with the results of leading methods for population substructure assignment verifies that our method provides comparable accuracy in identifying meaningful population subgroups, in addition to inferring relationships among them. The consensus tree approach thus provides a promising new model for the robust inference of substructure and ancestry from large-scale genetic variation data.
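To make the general idea of a consensus tree concrete, the following is a minimal sketch of a textbook majority-rule consensus over rooted trees. It is an illustration only and should not be read as the algorithm developed in this paper. The nested-tuple tree encoding, the function names `clades` and `majority_clades`, and the labels A through D are all assumptions introduced for this example.

```python
from collections import Counter


def clades(tree):
    """Return (leaf_set, clades) for a rooted tree.

    A tree is a leaf name (str) or a tuple of subtrees. This encoding is an
    assumption made for illustration. Each clade is the frozenset of leaf
    names found below one internal node.
    """
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)  # the clade rooted at this internal node
    return leaves, found


def majority_clades(trees, threshold=0.5):
    """Keep the clades present in more than `threshold` of the input trees."""
    counts = Counter()
    for tree in trees:
        _, tree_clades = clades(tree)
        counts.update(tree_clades)
    n = len(trees)
    return {clade for clade, k in counts.items() if k / n > threshold}


# Three hypothetical local phylogenies over four labels.
local_trees = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
]
consensus = majority_clades(local_trees)
```

In this toy input, two of the three trees group A with B and C with D, so those groupings survive the majority filter while the conflicting groupings from the third tree are dropped.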
Biology and genetics, trees, information theory, graph algorithms.
Ming-Chi Tsai, Russell Schwartz, Guy Blelloch, R. Ravi, "A Consensus Tree Approach for Reconstructing Human Evolutionary History and Detecting Population Substructure", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 8, no. 4, pp. 918-928, July/August 2011, doi:10.1109/TCBB.2011.23