2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum (2012)
Shanghai, China
May 21, 2012 to May 25, 2012
Bioinformatics is a rapidly emerging area of science with many important applications to human life. Sequence alignment, in its various forms, is one of the main instruments used in bioinformatics. This work is motivated by the ever-increasing amount of sequence data, which requires more and more computational power to process. This task calls for new GPU-based systems, which offer higher computational potential and energy efficiency than CPUs. We address the problem of facilitating faster sequence alignment using modern multi-GPU clusters. Our initial step was to develop a fast and scalable GPU aligner for exact matching of short sequences. We used a matching algorithm with a small memory footprint based on the Burrows-Wheeler transform. We developed a mathematical model of computation and communication costs to find the optimal memory partitioning strategy for the index and the queries. Our solution achieves a 10-fold speedup on one GPU over a previous suffix-array-based implementation and scales to multiple GPUs. Our next step will be to adapt the suggested data structure and performance model for multi-node, multi-GPU approximate sequence alignment. We also plan to use exact matching to detect common regions in large sequences, as an intermediate step in full-scale genome comparison.
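For readers unfamiliar with exact matching over a Burrows-Wheeler transform, the following is a minimal CPU-side sketch of the standard backward search technique. It is not the authors' GPU implementation: the function names `build_index` and `find_exact` are hypothetical, the index is built naively by sorting suffixes, and the occurrence table is stored uncompressed, so this version does not have the small memory footprint the abstract refers to.

```python
def build_index(text):
    """Build a suffix array, C table and Occ table for `text` (naive construction)."""
    text += "$"  # sentinel; assumed to sort before every other symbol in the text
    # Suffix array: starting positions of all suffixes in lexicographic order.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the symbol preceding each sorted suffix (text[-1] is the sentinel).
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # C[c]: number of symbols in the text that are strictly smaller than c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return sa, C, occ


def find_exact(pattern, sa, C, occ):
    """Return sorted start positions of exact occurrences of `pattern`."""
    lo, hi = 0, len(sa)  # current suffix-array interval [lo, hi)
    # Backward search: extend the match one symbol at a time, right to left.
    for ch in reversed(pattern):
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


if __name__ == "__main__":
    sa, C, occ = build_index("GATTACAGATTACA")
    print(find_exact("TTAC", sa, C, occ))  # [2, 9]
```

Each query touches only the C and Occ tables, one step per pattern symbol, so many short queries can be processed independently. That independence is what makes this kind of search a natural candidate for GPU parallelization.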
Indexes, Graphics processing unit, Bioinformatics, Genomics, Performance evaluation, Arrays, Complexity theory, Burrows-Wheeler transform, GPU, alignment
A. Drozd, N. Maruyama and S. Matsuoka, "Sequence Alignment on Massively Parallel Heterogeneous Systems," 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), Shanghai, China, 2012, pp. 2498-2501.