15th Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'03)
Comparison of Genomes Using High-Performance Parallel Computing
São Paulo, SP - Brazil
November 10-12, 2003
ISBN: 0-7695-2046-4
Comparing the DNA sequences and genes of two genomes can help investigate the functionalities common to the corresponding organisms and give a better understanding of how genes or groups of genes are organized and involved in various functions. In this paper we use high-performance parallel computing to compare the whole genomes of two organisms, Xanthomonas axonopodis pv. citri (Xac) and Xanthomonas campestris pv. campestris (Xcc), each with more than five million base pairs. Our purpose is twofold. First, we intend to exploit the computing power of a cluster of low-cost microcomputers, propose a parallel solution to this problem, and show its feasibility with implementation and performance results. Second, we perform additional comparisons of the two genomes by locating and comparing not only the homologous genes (expressed in terms of the 20-letter amino acid alphabet) but also the regions or gaps (expressed in terms of the 4-letter DNA nucleotide alphabet) between the corresponding homologous genes. We have implemented the proposed comparison strategy for Xac and Xcc. The parallel platform is a Beowulf cluster of 64 nodes consisting of low-cost microcomputers. Xac has 5,175,554 base pairs and 4,313 protein-coding genes, while Xcc has 5,076,187 base pairs and 4,182 protein-coding genes. The parallel solution is based on the dynamic programming approach and yields not only shorter processing times but also better-quality results than approaches based on BLAST and EGG.
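The abstract does not say which dynamic programming formulation the authors use, so the following is only a minimal sequential sketch of the general idea: a textbook global alignment score computed row by row. The function name alignment_score and the scoring values (match, mismatch, gap) are assumptions chosen for illustration and are not taken from the paper; the paper's parallel distribution over cluster nodes is not shown.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the global alignment score of sequences a and b.

    Illustrative scoring values only (assumed, not from the paper).
    Uses two rows of the dynamic programming table, so memory is O(len(b)).
    """
    # Row 0: aligning the empty prefix of a with b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] with the empty prefix of b costs i gaps.
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Or leave a[i-1] unpaired, or leave b[j-1] unpaired.
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[len(b)]


if __name__ == "__main__":
    # Works for any alphabet: 4-letter nucleotides or 20-letter amino acids.
    print(alignment_score("ACGT", "AGT"))
```

Because the recurrence only compares symbols for equality, the same routine applies to gene sequences over amino acids and to intergenic regions over nucleotides; a realistic protein comparison would typically replace the fixed match/mismatch values with a substitution matrix.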
Citation:
N. F. Almeida Jr., C. E. R. Alves, E. N. Caceres, S. W. Song, "Comparison of Genomes Using High-Performance Parallel Computing," 15th Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'03), p. 142, 2003.