Issue No.06 - June (2008 vol.19)

pp: 764-778

ABSTRACT

Advances in reconfigurable hardware, especially field-programmable gate arrays (FPGAs), have made it possible to accelerate complex applications, such as those in scientific computing. This has led to the development of reconfigurable computers: computers that combine general-purpose processors and reconfigurable hardware with memory and high-performance interconnection networks. In this paper, we describe the acceleration of molecular dynamics simulation with reconfigurable computers. We evaluate several design alternatives for the reconfigurable computer implementation. We show that a single node accelerated with reconfigurable hardware, exploiting fine-grained parallelism in the hardware design, achieves a speedup of about 2X over the corresponding software-only simulation. We then parallelize the application and study the effect of acceleration on performance and scalability. Specifically, we study strong scaling, in which the problem size is fixed as the number of nodes grows. We find that the unaccelerated version actually scales better because it spends a larger share of its time in computation than the accelerated version does. However, we also find that a cluster of P accelerated nodes gives better performance than a cluster of 2P unaccelerated nodes.
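The two scaling findings can look contradictory at first, so here is a minimal sketch of a toy strong-scaling model that reproduces both. It is not the performance model used in the paper. The function names, the linear communication cost, and every number below are assumptions chosen purely for illustration; only the roughly 2X single-node speedup comes from the abstract.

```python
def run_time(t_comp, t_comm, nodes, accel=1.0):
    """Toy model (assumption): compute time divides evenly across nodes and
    is reduced by the acceleration factor; communication time grows linearly
    with the number of additional nodes and is not accelerated."""
    return t_comp / (accel * nodes) + t_comm * (nodes - 1)


def efficiency(t_comp, t_comm, nodes, accel=1.0):
    """Parallel efficiency relative to a single node of the same kind."""
    t_one = run_time(t_comp, t_comm, 1, accel)
    t_par = run_time(t_comp, t_comm, nodes, accel)
    return t_one / (nodes * t_par)


# Hypothetical values: 100 time units of computation, 1 unit of
# communication per extra node, and a 2X hardware speedup.
T_COMP, T_COMM, ACCEL, P = 100.0, 1.0, 2.0, 4

eff_plain = efficiency(T_COMP, T_COMM, P)
eff_accel = efficiency(T_COMP, T_COMM, P, ACCEL)
time_accel_p = run_time(T_COMP, T_COMM, P, ACCEL)
time_plain_2p = run_time(T_COMP, T_COMM, 2 * P)

print(f"efficiency on {P} nodes: plain={eff_plain:.3f}, accelerated={eff_accel:.3f}")
print(f"time: {P} accelerated nodes={time_accel_p}, {2 * P} plain nodes={time_plain_2p}")
```

In this sketch both configurations have the same per-node compute time when P accelerated nodes are compared with 2P unaccelerated ones, so the difference comes entirely from the communication term, which is larger on the bigger cluster. Acceleration shrinks only the compute term, which is why the accelerated version shows lower parallel efficiency even though its absolute run time is shorter.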

INDEX TERMS

Reconfigurable hardware, Distributed architectures, Physics, Chemistry

CITATION

Maya B. Gokhale, Frans Trouw, Ronald Scrofano, "Accelerating Molecular Dynamics Simulations with Reconfigurable Computers", *IEEE Transactions on Parallel & Distributed Systems*, vol. 19, no. 6, pp. 764-778, June 2008, doi:10.1109/TPDS.2007.70777
