Bioinformatics, 2009 Ohio Collaborative Conference (2009)
Case Western Reserve University, Cleveland, Ohio
June 15, 2009 to June 17, 2009
Assembling expressed sequence tags (ESTs) is essential for removing redundancy and generating long virtual transcripts for EST annotation and gene finding. A number of assemblers are available, but detailed comparative assessments of their strengths and weaknesses are lacking. We compared three assemblers, Phrap, CAP3 and TIGR Assembler (TA), using Aspergillus niger and Phanerochaete chrysosporium EST data. Phrap assembled more ESTs into contigs than TA and CAP3. Among the contigs and singletons generated by the three assemblers, 67–90% were identical. The number of contigs and singletons assembled by Phrap provides an estimate of the maximum number of unique genes represented in the dataset, while the numbers generated by TA and CAP3 provide an approximate estimate of the number of unique transcripts, since both TA and CAP3 discriminate better between alternatively spliced transcripts. The error rate in contigs generated by Phrap was slightly higher than in contigs generated by TA or CAP3. Phrap is thus recommended for EST assembly aimed at generating a set of unisequences with minimum redundancy for estimating the unigene number, and TA or CAP3 for EST assembly aimed at finding unique transcripts, i.e., for identifying alternative splicing.
EST, assembler, transcript, alternative splicing
Reginald Storms, Xiangjia J. Min, Adrian Tsang, Gregory Butler, "Comparative Assessment of DNA Assemblers for Assembling Expressed Sequence Tags", Bioinformatics, 2009 Ohio Collaborative Conference, pp. 79-82, 2009, doi:10.1109/OCCBIO.2009.19