Sixth IEEE International Symposium on Cluster Computing and the Grid Workshops (CCGRIDW'06)
Cluster and Grid Based Classification of Transposable Elements in Eukaryotic Genomes
Singapore
May 16-May 19
ISBN: 0-7695-2585-7
In the last few years, improvements in computing and laboratory techniques for producing and analyzing DNA sequences have made the complete sequencing of whole genomes possible. This provides a wealth of raw genomes that need to be processed and annotated. All eukaryotic genomes examined and published thus far contain repetitive DNA. The amount of repetitive DNA in any specific eukaryotic genome ranges from 5% to 80%. These repeats consist mainly of transposable elements and tandem repeats, which need to be identified, classified and annotated in order to sequence and annotate an entire genome. This paper discusses the design and implementation of a distributed cluster- and grid-based workflow to classify transposable elements. We present experimental results for representative species genomes on a cluster and a grid, and discuss the performance of the workflow with regard to turnaround time, scalability, load balancing, resource utilization and fault tolerance.
Index Terms:
Cluster, distributed workflow, transposable elements, bioinformatics.
Citation:
Nirmal Ranganathan, Cédric Feschotte, David Levine, "Cluster and Grid Based Classification of Transposable Elements in Eukaryotic Genomes," ccgrid, vol. 2, p. 45, Sixth IEEE International Symposium on Cluster Computing and the Grid Workshops (CCGRIDW'06), 2006