Issue No. 03 - May/June (2011 vol. 31)
pp: 8-19
Ron O. Dror, D.E. Shaw Research
J.P. Grossman, D.E. Shaw Research
Kenneth M. Mackenzie, D.E. Shaw Research
Brian Towles, D.E. Shaw Research
Edmond Chow, D.E. Shaw Research
John K. Salmon, D.E. Shaw Research
Cliff Young, D.E. Shaw Research
Joseph A. Bank, D.E. Shaw Research
Brannon Batson, D.E. Shaw Research
Martin M. Deneroff, D.E. Shaw Research
Jeffrey S. Kuskin, D.E. Shaw Research
Richard H. Larson, D.E. Shaw Research
Mark A. Moraes, D.E. Shaw Research
David E. Shaw, D.E. Shaw Research
ABSTRACT
Anton, a massively parallel special-purpose machine that accelerates molecular dynamics simulations by orders of magnitude, uses a combination of specialized hardware mechanisms and restructured software algorithms to reduce and hide communication latency. Anton delivers end-to-end internode latency significantly lower than any other large-scale parallel machine, and its critical-path communication time for molecular dynamics simulations is less than 3 percent that of the next-fastest platform.
INDEX TERMS
Data communications, interprocessor communications, multiprocessor systems, network communication, parallel systems, special-purpose hardware, Anton
CITATION
Ron O. Dror, J.P. Grossman, Kenneth M. Mackenzie, Brian Towles, Edmond Chow, John K. Salmon, Cliff Young, Joseph A. Bank, Brannon Batson, Martin M. Deneroff, Jeffrey S. Kuskin, Richard H. Larson, Mark A. Moraes, David E. Shaw, "Overcoming Communication Latency Barriers in Massively Parallel Scientific Computation," IEEE Micro, vol. 31, no. 3, pp. 8-19, May/June 2011, doi:10.1109/MM.2011.38