Subscribe
Issue No.02 - April-June (2008 vol.5)
pp: 161-171
ABSTRACT
A genetic map is an ordering of geneticmarkers calculated from a population of known lineage.While traditionally a map has been generated from a singlepopulation for each species, recently researchers have createdmaps from multiple populations. In the face of thesenew data, we address the need to find a consensus map &#x2014; a map that combines the information from multiple partialand possibly inconsistent input maps. We model eachinput map as a partial order and formulate the consensusproblem as finding a median partial order. Finding themedian of multiple total orders (preferences or rankings)is a well studied problem in social choice. We choose tofind the median using the weighted symmetric differencedistance, a more general version of both the symmetricdifference distance and the Kemeny distance. Finding amedian order using this distance is NP-hard. We showthat for our chosen weight assignment, a median ordersatisfies the positive responsiveness, extended Condorcet,and unanimity criteria. Our solution involves finding themaximum acyclic subgraph of a weighted directed graph.We present a method that dynamically switches betweenan exact branch and bound algorithm and a heuristicalgorithm, and show that for real data from closely relatedorganisms, an exact median can often be found.We presentexperimental results using seven populations of the cropplant Zea mays.
INDEX TERMS
Genetic map, median order, path and circuit problems, Kemeny distance, symmetric difference distance.
CITATION
Benjamin N. Jackson, Patrick S. Schnable, Srinivas Aluru, "Consensus Genetic Maps as Median Orders from Inconsistent Sources", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.5, no. 2, pp. 161-171, April-June 2008, doi:10.1109/TCBB.2007.70221
REFERENCES
 [1] A.V. Aho, M.R. Garey, and J.D. Ulman, “The Transitive Reduction of a Directed Graph,” SIAM J. Computing, vol. 1, pp. 131-137, 1972. [2] N. Ailon, M. Charikar, and A. Newman, “Aggregating Inconsistent Information: Ranking and Clustering,” Proc. 37th Ann. ACM Symp. Theory of Computing, pp. 684-693, 2005. [3] K. Arrow, Social Choice and Individual Values. John Wiley, 1951. [4] J.P. Barthelemy and B. Monjardet, “The Median Procedure in Data Analysis: New Results and Open Problems,” Proc. Second Conf. Int'l Federation of Classification Societies, pp. 309-316, 1988. [5] T.H. Corman, C.E. Leiserson, R.L. Rivest, and C. Stein, Introduction to Algorithms, second ed. 2003. [6] A. Davenport and J. Kalagnanam, “A Computational Study of the Kemeny Rule for Preference Aggregation,” Proc. Nat'l Conf. Artificial Intelligence, 2004. [7] C. Demetrescu and F.I. Giuseppe, “Trade-Offs for Fully Dynamic Transitive Closure on DAGs: Breaking through the $O(n^{2})$ Barrier,” J. ACM, vol. 52, pp. 147-156, 2005. [8] C. Dwork, R. Kumar, M. Naor, and D. Sivakumar, “Rank Aggregation Methods for the Web,” Proc. 10th Int'l Conf. World Wide Web, 2001. [9] P. Eades, X. Lin, and W.F. Smith, “A Fast and Effective Solution for the Feedback Arc Set Problem,” Information Processing Letters, vol. 47, pp. 319-323, 1998. [10] G. Even, J. Naor, B. Schieber, and M. Sudan, “Approximating Minimum Feedback Sets and Multicuts in Directed Graphs,” Algorithmica, vol. 20, pp. 151-174, 1998. [11] C.T. Falk, “Preliminary Ordering of Multiple Linked Loci Using Pairwise Linkage Data,” J. Quantum Trait Loci, vol. 2, 1992. [12] R.W. Floyd, “Algorithm 97: Shortest Path,” Comm. ACM, vol. 5, p.345, 1962. [13] Y. Fu, T.J. Wen, Y.I. Ronin, H.D. Chen, L. Guo, D.I. Mester, Y. Yang, M. Lee, A.B. Korol, D.A. Ashlock, and P.S. Schnable, “Genetic Dissection of Intermated Recombinant Inbred Lines Using a New Genetic Map of Maize,” Genetics, vol. 174, pp. 1671-1683, 2006. [14] A. Goralcikova and K. Koubek, “A Reduct-and-Closure Algorithm for Graphs,” Math. Foundations of Computer Science, vol. 74, pp.301-307, 1979. [15] S. Hannenhalli and P.A. Pevzner, “Transforming Men into Mice (Polynomial Algorithm for Genomic Distance Problem),” Proc. 36th Ann. IEEE Symp. Foundations of Computer Science, pp. 581-592, 1995. [16] S. Hannenhalli and P.A. Pevzner, “Transforming Cabbage into Turnip (Polynomial Algorithm for Sorting Signed Permutations by Reversals,” J. ACM, vol. 48, pp. 1-27, 1999. [17] O. Hudrey, “Computation of Median Orders: Complexity Results,” Proc. DIMACS-LAMSADE Workshop Computer Science and Decision Theory, 2004. [18] B. Jackson, S. Aluru, and P. Schnable, “Consensus Genetic Maps: A Graph Theoretic Approach,” Proc. IEEE Computational Systems Bioinformatics Conf., pp. 35-43, 2005. [19] D.B. Johnson, “Finding All the Elementary Circuits of a Directed Graph,” SIAM J. Computing, vol. 4, pp. 77-84, 1975. [20] J.P. Kemeny, “Mathematics without Numbers,” Daedelus, vol. 88, pp. 577-591, 1959. [21] S. Knapp, C. Echt, and B.H. Liu, “Genome Mapping with Non-Inbred Crosses Using Gmendel 2.0,” Maize Genetics Cooperation Newsletter, vol. 66, pp. 22-79, 1992. [22] E. Lander, P. Green, J. Abrahamson, A. Barlow, M.J. Daly, S.E. Lincoln, and L. Newburg, “An Interactive Computer Package for Constructing Primary Genetic Linkage Maps of Experimental and Natural Populations,” Genomics, vol. 1, pp. 174-181, 1997. [23] M. Lee, N. Sharopova, W.D. Beavis, D. Grant, M. Katt, D. Blair, and A. Hallauer, “Expanding the Genetic Map of Maize with Intermated B73 Mo17 (IBM) Population,” Plant Molecular Biology, vol. 48, pp. 453-461, 2002. [24] A. Levenglick, “Fair and Reasonable Election Systems,” Behavioral Science, vol. 20, pp. 34-46, 1975. [25] L. Lovasz, “On the Ratio of Optimal Integral and Fractional Covers,” Discrete Math., vol. 13, pp. 383-390, 1964. [26] D. Mester and O. Braysy, “Fast and High Precision Algorithms for Optimization in Large-Scale Genomic Problems,” Computational Biology and Chemistry, vol. 28, pp. 281-289, 2004. [27] D. Mester, E. Ronin, E. Nevo, and A. Korol, “Constructing Large Scale Genetic Maps Using Evolutionary Strategy Algorithm,” Genetics, vol. 165, pp. 2269-2282, 2003. [28] M.E. Moret, J. Tang, and T. Warnow, “Reconstructing Phylogenies from Gene-Content and Gene-Order Data,” Math. Evolution and Phylogeny, O. Gascuel, ed., chapter 12, pp.321-352, Oxford Univ. Press, 2006. [29] J.M. Olson and M. Boehnke, “Monte Carlo Comparison of Preliminary Methods of Ordering Multiple Genetic Loci,” Am. J. Human Genetics, vol. 47, pp. 470-482, 1990. [30] J. Ott, Analysis of Human Genetic Linkage. John Hopkins Univ. Press, 1985. [31] D. Sankoff, C. Zheng, and A. Lenert, “Reversals of Fortune,” Proc. RECOMB Workshop Comparative Genomics, pp. 131-141, 2005. [32] T. Schiex and C. Gaspin, “CARTHAGENE: Constructing and Joining Maximum Likelihood Genetic Maps,” Proc. Fifth Int'l Conf. Intelligent Systems in Molecular Biology, vol. 48, pp. 453-461, 1997. [33] P. Slavik, “A Tight Analysis of the Greedy Algorithm for Set Cover,” Proc. 28th Ann. ACM Symp. Theory of Computing, pp. 435-441, 1996. [34] M. Syslo, N. Deo, and J. Kowalik, Discrete Optimization Algorithms and Pascal Programs. Prentice Hall, 1983. [35] M. Truchon, “An Extension of the Condorcet Criterion and Kemeny Orders,” cahier 98-15 du Centre de Recherche en Economie et Finance Appliquees, 1998. [36] M. Truchon, “Aggregation of Rankings in Figure Skating,” Cahiers de Rcherche, 2005. [37] Y. Wakabayashi, “The Complexity of Computing Medians of Relations,” Resenhas, vol. 3, pp. 323-349, 1998. [38] S. Yancopoulos, O. Attie, and R. Friedberg, “Efficient Sorting of Genomic Permutations by Translocation, Inversion, and Block Interchange,” Bioinformatics, vol. 21, pp. 3340-3346, 2005. [39] H. Yao and P.S. Schnable, “Cis-Effects on Meiotic Recombination Across Distinct a1-sh2 Intervals in a Common Zea Genetic Background,” Genetics, vol. 170, pp. 1929-1944, 2004. [40] I.V. Yap, D. Schneider, J. Kleinberg, D. Matthews, S. Cartinhour, and S.R. McCough, “A Graph-Theoretic Approach to Comparing and Integrating Genetic Physical and Sequence-Based Maps,” Genetics, vol. 165, pp. 2235-2247, 2003.