Issue No. 02 - April-June (2008 vol. 5)
A genetic map is an ordering of genetic markers calculated from a population of known lineage. While traditionally a map has been generated from a single population for each species, recently researchers have created maps from multiple populations. In the face of these new data, we address the need to find a consensus map: a map that combines the information from multiple partial and possibly inconsistent input maps. We model each input map as a partial order and formulate the consensus problem as finding a median partial order. Finding the median of multiple total orders (preferences or rankings) is a well-studied problem in social choice. We choose to find the median using the weighted symmetric difference distance, a more general version of both the symmetric difference distance and the Kemeny distance. Finding a median order using this distance is NP-hard. We show that for our chosen weight assignment, a median order satisfies the positive responsiveness, extended Condorcet, and unanimity criteria. Our solution involves finding the maximum acyclic subgraph of a weighted directed graph. We present a method that dynamically switches between an exact branch-and-bound algorithm and a heuristic algorithm, and show that for real data from closely related organisms, an exact median can often be found. We present experimental results using seven populations of the crop plant Zea mays.
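To make the reduction to a maximum acyclic subgraph concrete, the following is a minimal illustrative sketch, not the authors' algorithm. It assumes that each input map is given as a total order over a subset of markers (the paper works with partial orders), that the weight of an edge (a, b) is simply the number of input maps placing a before b (the paper's weight assignment is not given in the abstract), and that exhaustive enumeration stands in for the branch-and-bound and heuristic methods. The function names and marker labels are hypothetical.

```python
from itertools import permutations


def precedence_weights(maps):
    """For each ordered pair (a, b), count the input maps that place a before b.

    Assumption: this simple vote count stands in for the paper's weights.
    """
    weights = {}
    for order in maps:
        for i, a in enumerate(order):
            for b in order[i + 1:]:
                weights[(a, b)] = weights.get((a, b), 0) + 1
    return weights


def median_order_bruteforce(maps):
    """Return (order, score) maximizing the total weight of satisfied edges.

    The edges consistent with a linear order form an acyclic subgraph, so
    the best permutation yields a maximum-weight acyclic subgraph of the
    precedence graph. Enumeration is only feasible for a handful of markers.
    """
    markers = sorted({m for order in maps for m in order})
    weights = precedence_weights(maps)
    best_order, best_score = None, -1
    for perm in permutations(markers):
        pos = {m: i for i, m in enumerate(perm)}
        score = sum(w for (a, b), w in weights.items() if pos[a] < pos[b])
        if score > best_score:
            best_order, best_score = list(perm), score
    return best_order, best_score


# Three hypothetical input maps; the second disagrees on the order of m2 and m3,
# and the third is partial (it lacks m4).
maps = [
    ["m1", "m2", "m3", "m4"],
    ["m1", "m3", "m2", "m4"],
    ["m1", "m2", "m3"],
]
order, score = median_order_bruteforce(maps)
print(order, score)  # ['m1', 'm2', 'm3', 'm4'] 14
```

In this toy input the only conflicting pair is (m2, m3), and the majority ordering wins. The NP-hardness noted in the abstract is why exhaustive search of this kind cannot scale to real maps.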
Genetic map, median order, path and circuit problems, Kemeny distance, symmetric difference distance.
Benjamin N. Jackson, Srinivas Aluru, Patrick S. Schnable, "Consensus Genetic Maps as Median Orders from Inconsistent Sources", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 5, no. 2, pp. 161-171, April-June 2008, doi:10.1109/TCBB.2007.70221