2004 IEEE Computational Systems Bioinformatics Conference (CSB'04)
Space-Conserving Optimal DNA-Protein Alignment
Stanford, California
August 16-August 19
ISBN: 0-7695-2194-0
DNA-protein alignment algorithms can be used to discover coding sequences in a genomic sequence, if the corresponding protein derivatives are known. They can also be used to identify potential coding sequences of a newly sequenced genome, by using proteins from related species. Previously known algorithms either solve a simplified formulation, or sacrifice optimality to achieve practical implementation. In this paper, we present a comprehensive formulation of the DNA-protein alignment problem, and an algorithm to compute the optimal alignment in O(mn) time using only four tables of size (m + 1) × (n + 1), where m and n are the lengths of the DNA and protein sequences, respectively. We also developed a Protein and DNA Alignment program, PanDA, that implements the proposed solution. Experimental results indicate that our algorithm produces high quality alignments.
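To make the table dimensions and the O(mn) bound concrete, the following is a minimal sketch of a simplified DNA-protein alignment by dynamic programming. It is not the algorithm of the paper and not the PanDA implementation: it uses a single (m + 1) × (n + 1) table rather than four, models no introns, and computes only the score. The function name `align_score` and all scoring parameters (`match`, `mismatch`, `gap`, `frameshift`) are assumptions chosen for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def align_score(dna, protein, match=2, mismatch=-1, gap=-2, frameshift=-3):
    """Global alignment score of a DNA string against a protein string.

    Illustrative scoring only (all values are assumptions):
      - a codon aligned to an amino acid scores match or mismatch,
      - an unaligned amino acid or an unaligned whole codon costs gap,
      - skipping 1 or 2 nucleotides costs frameshift.
    """
    m, n = len(dna), len(protein)
    neg_inf = float("-inf")
    # score[i][j]: best score aligning dna[:i] with protein[:j].
    score = [[neg_inf] * (n + 1) for _ in range(m + 1)]
    score[0][0] = 0
    for i in range(m + 1):
        for j in range(n + 1):
            if i == 0 and j == 0:
                continue
            best = neg_inf
            if i >= 3 and j >= 1:
                # Translate the codon ending at position i; unknown codons map to "X".
                aa = CODON_TABLE.get(dna[i - 3:i], "X")
                sub = match if aa == protein[j - 1] else mismatch
                best = max(best, score[i - 3][j - 1] + sub)
            if j >= 1:
                # Amino acid with no corresponding codon.
                best = max(best, score[i][j - 1] + gap)
            if i >= 3:
                # Whole codon with no corresponding amino acid.
                best = max(best, score[i - 3][j] + gap)
            if i >= 1:
                # Frameshift: skip one nucleotide.
                best = max(best, score[i - 1][j] + frameshift)
            if i >= 2:
                # Frameshift: skip two nucleotides.
                best = max(best, score[i - 2][j] + frameshift)
            score[i][j] = best
    return score[m][n]
```

Each cell is filled in constant time from at most five earlier cells, so the sketch runs in O(mn) time and uses one table of size (m + 1) × (n + 1). For example, under these assumed scores, `align_score("ATGGCC", "MA")` aligns the two codons ATG and GCC to M and A.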
Citation:
Pang Ko, Mahesh Narayanan, Anantharaman Kalyanaraman, Srinivas Aluru, "Space-Conserving Optimal DNA-Protein Alignment," 2004 IEEE Computational Systems Bioinformatics Conference (CSB'04), pp. 80-88, 2004