Issue No. 10, Oct. 2012 (vol. 23), pp. 1970-1982
Henry Hoffmann , Massachusetts Institute of Technology, Cambridge
Anant Agarwal , Massachusetts Institute of Technology, Cambridge
Srinivas Devadas , Massachusetts Institute of Technology, Cambridge
ABSTRACT
Design patterns for parallel computing attempt to make the field accessible to nonexperts by generalizing the common techniques experts use to develop parallel software. Existing parallel patterns have tremendous descriptive power, but it is often unclear to nonexperts how to choose a pattern based on the specific performance goals of a given application. This paper addresses the need for a pattern selection methodology by presenting four patterns and an accompanying decision framework for choosing among them given an application's throughput and latency goals. The patterns are based on recognizing that one can partition either an application's data or its instructions, and that either partitioning can be done in time or in space; hence, we refer to them as spatiotemporal partitioning strategies. This paper introduces a taxonomy that describes each of the resulting four partitioning strategies and presents a three-step methodology for selecting one or more of them given a throughput and latency goal. Several case studies, ranging from simple examples to more complicated applications such as a radar processing application and an H.264 video encoder, illustrate the use of this methodology.
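The four strategies arise from crossing what is partitioned (data or instructions) with how it is partitioned (in space or in time). As a minimal illustrative sketch of that 2x2 taxonomy, in Python and with names that are not taken from the paper, the strategies can be enumerated as follows:

    # Illustrative sketch only: enumerate the 2x2 taxonomy of
    # spatiotemporal partitioning strategies described in the abstract.
    from enum import Enum

    class What(Enum):
        DATA = "data"                   # partition the application's data
        INSTRUCTIONS = "instructions"   # partition the application's instructions

    class How(Enum):
        SPACE = "space"   # distribute across processors
        TIME = "time"     # distribute across sequential phases or pipeline stages

    # The four spatiotemporal partitioning strategies.
    STRATEGIES = [(w, h) for w in What for h in How]

    for w, h in STRATEGIES:
        print(f"partition {w.value} in {h.value}")

The paper's three-step selection methodology then chooses among these four combinations based on the application's throughput and latency goals; the details of that decision framework are given in the paper itself, not in this sketch.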
INDEX TERMS
Cameras, indexes, spatiotemporal phenomena, throughput, spatial databases, security, decision trees, parallel programming, design patterns, parallel computing
CITATION
Henry Hoffmann, Anant Agarwal, Srinivas Devadas, "Selecting Spatiotemporal Patterns for Development of Parallel Applications," IEEE Transactions on Parallel & Distributed Systems, vol. 23, no. 10, pp. 1970-1982, Oct. 2012, doi: 10.1109/TPDS.2011.298