IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 3, May-June 2012
Fast Parallel Markov Clustering in Bioinformatics Using Massively Parallel Computing on GPU with CUDA and ELLPACK-R Sparse Format
A. Bustamam , Dept. of Math., Univ. of Indonesia, Depok, Indonesia
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2011.68
Markov clustering (MCL) is becoming a key algorithm within bioinformatics for determining clusters in networks. However, with the vast and increasing amount of data on biological networks, performance and scalability issues are becoming a critical limiting factor in applications. Meanwhile, GPU computing, which uses the CUDA tool to implement a massively parallel computing environment on the GPU card, is becoming a very powerful, efficient, and low-cost option for achieving substantial performance gains over CPU approaches. The use of on-chip memory on the GPU efficiently lowers latency, thus circumventing a major issue in other parallel computing environments, such as MPI. We introduce a very fast Markov clustering algorithm using CUDA (CUDA-MCL) to perform parallel sparse matrix-matrix computations and parallel sparse Markov matrix normalizations, which are at the heart of MCL. We utilized the ELLPACK-R sparse format to allow effective, fine-grained, massively parallel processing that copes with the sparse nature of interaction network data sets in bioinformatics applications. As the results show, CUDA-MCL is significantly faster than the original MCL running on a CPU. Thus, large-scale parallel computation on off-the-shelf desktop machines, which was previously only possible on supercomputing architectures, can significantly change the way bioinformaticians and biologists deal with their data.
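The abstract names three building blocks: the ELLPACK-R sparse format, row-parallel sparse products, and Markov matrix normalization. The following is a minimal CPU-side sketch in plain Python, not the paper's CUDA code. It assumes the usual ELLPACK-R layout (padded value and column-index arrays plus a per-row length array) and the textbook MCL iteration (expansion by matrix squaring, inflation by an elementwise power, then column normalization). All function names and the `inflation` parameter are hypothetical, chosen for illustration only, and the `mcl_step` helper works on dense lists for readability rather than on the sparse arrays.

```python
def to_ellpack_r(dense):
    """Pack a dense row-major matrix into ELLPACK-R style arrays (assumed layout).

    Returns (values, col_idx, row_len). values and col_idx are padded to the
    longest row; row_len[i] is the number of real nonzeros stored in row i.
    """
    rows = [[(j, v) for j, v in enumerate(row) if v != 0] for row in dense]
    row_len = [len(r) for r in rows]
    width = max(row_len, default=0)
    values = [[v for _, v in r] + [0.0] * (width - len(r)) for r in rows]
    col_idx = [[j for j, _ in r] + [0] * (width - len(r)) for r in rows]
    return values, col_idx, row_len


def ellr_matvec(values, col_idx, row_len, x):
    """Compute y = A x. Each outer iteration mirrors the work of one GPU thread."""
    y = []
    for i in range(len(row_len)):
        acc = 0.0
        for k in range(row_len[i]):  # row_len lets the loop skip padding entries
            acc += values[i][k] * x[col_idx[i][k]]
        y.append(acc)
    return y


def normalize_columns(m):
    """Scale every column to sum to 1, giving a column-stochastic matrix."""
    n_rows, n_cols = len(m), len(m[0])
    sums = [sum(m[i][j] for i in range(n_rows)) for j in range(n_cols)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n_cols)]
            for i in range(n_rows)]


def mcl_step(m, inflation=2.0):
    """One textbook MCL iteration on a dense square matrix (illustrative only)."""
    n = len(m)
    # Expansion: matrix square.
    expanded = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]
    # Inflation: elementwise power, followed by column normalization.
    inflated = [[v ** inflation for v in row] for row in expanded]
    return normalize_columns(inflated)
```

In a GPU version, the outer loop of `ellr_matvec` would typically become one thread per row, and the row-length array would let each thread stop before the padding. Whether CUDA-MCL assigns work exactly this way is not stated in the abstract.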
parallel architectures, bioinformatics, graphics processing units, large-scale systems, Markov processes, supercomputing architectures, fast parallel Markov clustering, ELLPACK-R sparse format, biological networks, critical limiting factor, GPU computing, CUDA tool, massively parallel computing environment, on-chip memory, fast Markov clustering algorithm, parallel sparse matrix-matrix computations, parallel sparse Markov matrix normalizations, fine-grain massively parallel processing, interaction networks data sets, large-scale parallel computation, off-the-shelf desktop-machines, proteins, instruction sets, parallel processing, multicore processing, Markov clustering, graphs and networks, PPI networks, CUDA, scalable parallel programming, parallelism and concurrency, performance evaluation
A. Bustamam, "Fast Parallel Markov Clustering in Bioinformatics Using Massively Parallel Computing on GPU with CUDA and ELLPACK-R Sparse Format", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.9, no. 3, pp. 679-692, May-June 2012, doi:10.1109/TCBB.2011.68