Issue No.08 - August (2006 vol.17)
Mehmet M. Dalkilic, IEEE
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPDS.2006.104
<p><b>Abstract</b>—Many applications in comparative genomics lend themselves to implementations that take advantage of common high-performance features in modern microprocessors. However, the common suggestion that a data-parallel, multithreaded, or high-throughput implementation is possible often ignores the complexity of actually creating such software. In this paper, we present two parallel versions of a classic comparative genomics algorithm, the dot plot. First, we describe a data-parallel algorithm that achieves speedups of up to 14.4x over the sequential version for large genomic comparisons. Then, we use the new algorithm as the base for a coarse-grained parallel version, suitable for multiprocessor and cluster environments, that scales linearly with the number of processors. These speedups make it possible to perform full pairwise comparisons of entire genomes on a much larger scale than previously possible. We also present the experimental, model-driven approach used to develop the algorithm, which allowed us to carefully study and evaluate implementation options and to fully understand the parameters affecting its performance.</p>
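For readers unfamiliar with the dot plot, the following is a minimal sequential sketch of the general technique, not the authors' implementation. It marks every position pair (i, j) at which the two sequences share an identical substring of a given length. The function name `dot_plot` and the `window` parameter are illustrative assumptions, and the exact-match window is only one of several common scoring choices.

```python
def dot_plot(seq_a, seq_b, window=1):
    """Return sorted (i, j) pairs where seq_a[i:i+window] == seq_b[j:j+window].

    Naive O(len(seq_a) * len(seq_b)) baseline, written for clarity only.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    rows = len(seq_a) - window + 1  # valid start positions in seq_a
    cols = len(seq_b) - window + 1  # valid start positions in seq_b
    hits = []
    for i in range(rows):
        word = seq_a[i:i + window]
        for j in range(cols):
            # Every (i, j) comparison is independent of the others,
            # which is what makes the problem amenable to parallelism.
            if word == seq_b[j:j + window]:
                hits.append((i, j))
    return hits


if __name__ == "__main__":
    # Print the plot as text: '*' marks a match, '.' a mismatch.
    a, b = "ACGT", "TACG"
    marked = set(dot_plot(a, b, window=2))
    for i in range(len(a) - 1):
        print("".join("*" if (i, j) in marked else "." for j in range(len(b) - 1)))
```

Because each cell of the plot depends only on its own pair of positions, the inner comparisons can in principle be distributed across vector lanes or across processors; how the paper actually does this is described in the full text, not here.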
Dot plot, data-parallel, pairwise comparison, sequence alignment, vector processor, AltiVec, high-performance computing, comparative genomics, performance measures.
Mehmet M. Dalkilic, Christopher Mueller, "High-Performance Direct Pairwise Comparison of Large Genomic Sequences", IEEE Transactions on Parallel & Distributed Systems, vol. 17, no. 8, pp. 764-772, August 2006, doi:10.1109/TPDS.2006.104