Fat versus Thin Threading Approach on GPUs: Application to Stochastic Simulation of Chemical Reactions
February 2012 (vol. 23 no. 2)
pp. 280-287
Guido Klingbeil, University of Oxford, Oxford
Radek Erban, University of Oxford, Oxford
Mike Giles, University of Oxford, Oxford
Philip K. Maini, University of Oxford, Oxford
We explore two threading approaches on a graphics processing unit (GPU), each exploiting a different characteristic of the current GPU architecture. The fat thread approach tries to minimize data access time by relying on shared memory and registers, potentially sacrificing parallelism. The thin thread approach maximizes parallelism and tries to hide access latencies. We apply both approaches to the parallel stochastic simulation of chemical reaction systems using the stochastic simulation algorithm (SSA) by Gillespie [14]. In these simulations, the proposed thin thread approach shows performance comparable to the fat thread approach while eliminating the limit on the size of the reaction system.
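
For readers unfamiliar with the SSA, the following is a minimal sequential sketch of Gillespie's direct method in Python. It is an illustration only, not the authors' GPU implementation: the function name `gillespie_direct`, its parameters, and the single-species decay example are assumptions introduced here. On a GPU one would typically run many such independent realisations in parallel, for instance one per thread; that mapping is also an assumption and is not stated in the abstract.

```python
import math
import random


def gillespie_direct(x0, stoich, propensities, t_end, rng):
    """Run one realisation of the direct-method SSA up to time t_end.

    x0           -- initial molecule counts, one entry per species
    stoich       -- stoich[j][i] is the change in species i when reaction j fires
    propensities -- function mapping the state to a list of reaction propensities
    rng          -- a random.Random instance (hypothetical choice of generator)
    """
    t = 0.0
    x = list(x0)  # copy so the caller's initial state is not modified
    while True:
        a = propensities(x)
        a0 = sum(a)
        if a0 <= 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next reaction is exponential with rate a0.
        # rng.random() lies in [0, 1), so the argument of log is in (0, 1].
        t += -math.log(1.0 - rng.random()) / a0
        if t > t_end:
            break
        # Pick reaction j with probability a[j] / a0.
        threshold = rng.random() * a0
        acc = 0.0
        j = len(a) - 1  # fallback guards against floating-point round-off
        for idx, aj in enumerate(a):
            acc += aj
            if threshold < acc:
                j = idx
                break
        # Apply the state change of the chosen reaction.
        for i, change in enumerate(stoich[j]):
            x[i] += change
    return x


if __name__ == "__main__":
    # Hypothetical example: decay A -> 0 with rate constant 1.0, 100 molecules.
    decay = lambda x: [1.0 * x[0]]
    print(gillespie_direct([100], [[-1]], decay, 0.5, random.Random(0)))
```

Each realisation needs only the current state, the propensities, and a random number stream, which is why independent realisations parallelise naturally. Where that working data is kept (fast on-chip memory versus larger, slower memory) is the trade-off the two threading approaches address.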

[1] Nvidia CUDA Programming Guide, Version 2.1, NVIDIA Corporation, 2701 San Tomas Expressway, vol. 12, 2008.
[2] R. Erban, S. Chapman, I. Kevrekidis, and T. Vejchodsky, "Analysis of a Stochastic Chemical System Close to a Sniper Bifurcation of Its Mean-Field Model," SIAM J. Applied Math., vol. 70, no. 3, pp. 984-1016, 2009.
[3] B. Alberts, A. Johnson, J. Lewis, M. Raff, K. Roberts, and P. Walter, Molecular Biology of the Cell. Garland Science, 2007.
[4] F. Holstege, E. Jennings, J. Wyrick, T. Lee, C. Hengartner, M. Green, T. Golub, E. Lander, and R. Young, "Dissecting the Regulatory Circuitry of a Eukaryotic Genome," Cell, vol. 95, no. 5, pp. 717-728, Nov. 1998.
[5] J. Hasty, D. McMillen, F. Isaacs, and J. Collins, "Computational Studies of Gene Regulatory Networks: In Numero Molecular Biology," Nature Rev. Genetics, vol. 2, no. 4, pp. 268-279, Apr. 2001.
[6] T. Tian and K. Burrage, "Stochastic Models for Regulatory Networks of the Genetic Toggle Switch," Proc. Nat'l Academy of Sciences of USA, vol. 103, no. 22, pp. 8372-8377, 2006.
[7] G. Ewing, D. McNickle, and K. Pawlikowski, "Multiple Replications in Parallel: Distributed Generation of Data for Speeding up Quantitative Stochastic Simulation," Proc. Int'l Assoc. for Mathematics and Computers in Simulation (IMACS '97), pp. 397-402, 1997.
[8] L. Dematte and T. Mazza, "On Parallel Stochastic Simulation of Diffusive Systems," Proc. Sixth Int'l Conf. Computational Methods in Systems Biology (CMSB '08), pp. 191-210, 2008.
[9] T. Tian and K. Burrage, "Parallel Implementation of Stochastic Simulation for Large-Scale Cellular Processes," Proc. Eighth Int'l Conf. High Performance Computing and Grid in Asia-Pacific Region, vol. 0, pp. 621-626, 2005.
[10] A. Snavely, L. Carter, J. Boisseau, A. Majumdar, K.S. Gatlin, N. Mitchell, J. Feo, and B. Koblenz, "Multi-Processor Performance on the Tera MTA," Proc. IEEE/ACM Conf. Supercomputing (SC '98), pp. 4-4, 1998.
[11] A. Agarwal, J. Kubiatowicz, D. Kranz, B. Lim, D. Yeung, G. D'souza, and M. Parkin, "Sparcle: An Evolutionary Processor Design for Large-Scale Multiprocessors," IEEE Micro, vol. 13, no. 3, pp. 48-61, June 1993.
[12] F. Irigoin and R. Triolet, "Supernode Partitioning," POPL '88: Proc. 15th ACM SIGPLAN-SIGACT Symp. Principles of Programming Languages, pp. 319-329, 1988.
[13] D. Gillespie, "A General Method for Numerically Simulating the Stochastic Time Evolution of Coupled Chemical Reactions," J. Computational Physics, vol. 22, pp. 403-434, 1976.
[14] D. Gillespie, "Exact Stochastic Simulation of Coupled Chemical Reactions," J. Physical Chemistry, vol. 81, no. 25, pp. 2340-2361, 1977.
[15] M. Gibson and J. Bruck, "Efficient Exact Stochastic Simulation of Chemical Systems with Many Species and Many Channels," J. Physical Chemistry A, vol. 104, pp. 1876-1889, 2000.
[16] H. Li and L. Petzold, "Logarithmic Direct Method for Discrete Stochastic Simulation of Chemically Reacting Systems," technical report, Dept. of Computer Science, Univ. of California, 2006.
[17] Y. Cao, H. Li, and L. Petzold, "Efficient Formulation of the Stochastic Simulation Algorithm for Chemically Reacting Systems," J. Chemical Physics, vol. 121, no. 9, pp. 4059-4067, 2004.
[18] E. Lindholm, J. Nickolls, S. Oberman, and J. Montrym, "Nvidia Tesla: A Unified Graphics and Computing Architecture," IEEE CS Hot Chips, no. 19, pp. 39-45, Mar./Apr. 2008.
[19] J. Nickolls, I. Buck, M. Garland, and K. Skadron, "Scalable Parallel Programming with CUDA," ACM Queue, vol. 6, no. 2, pp. 40-53, Mar./Apr. 2008.
[20] P. Maciol and K. Banas, "Testing Tesla Architecture for Scientific Computing: The Performance of Matrix-Vector Product," Proc. Int'l Multiconf. Computer Science and Information Technology, vol. 3, pp. 285-291, 2008.
[21] Nvidia Compute PTX: Parallel Thread Execution, ISA Version 1.4, NVIDIA Corporation, 2701 San Tomas Expressway, vol. 3, 2009.
[22] G.M. Amdahl, "Validity of the Single Processor Approach to Achieving Large Scale Computing Capabilities," AFIPS '67: Proc. Apr. 18-20, 1967, Spring Joint Computer Conf., pp. 483-485, 1967.
[23] D. Gillespie, "Approximate Accelerated Stochastic Simulation of Chemically Reacting Systems," J. Chemical Physics, vol. 115, no. 4, pp. 1716-1733, 2001.
[24] J. Murray, Mathematical Biology 1: An Introduction, third ed. Springer Verlag, 2002.
[25] J. Vilar, H. Kueh, N. Barkai, and S. Leibler, "Mechanisms of Noise-Resistance in Genetic Oscillators," Proc. Nat'l Academy of Sciences of USA, vol. 99, no. 9, pp. 5988-5992, 2002.
[26] D. Wilkinson, Stochastic Modelling for Systems Biology. Chapman & Hall/CRC, 2006.
[27] NVIDIA CUDA C Programming Best Practices Guide CUDA Toolkit 2.3, NVIDIA Corporation, 2701 San Tomas Expressway, July 2008.
[28] NVIDIA's Next Generation CUDA Compute Architecture: Fermi, NVIDIA Corporation, 2701 San Tomas Expressway, v 1.1, 2009.

Index Terms:
Parallel processing, compute unified device architecture (CUDA), graphics processing unit (GPU).
Guido Klingbeil, Radek Erban, Mike Giles, Philip K. Maini, "Fat versus Thin Threading Approach on GPUs: Application to Stochastic Simulation of Chemical Reactions," IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 2, pp. 280-287, Feb. 2012, doi:10.1109/TPDS.2011.157