Streamline Integration Using MPI-Hybrid Parallelism on a Large Multicore Architecture
November 2011 (vol. 17 no. 11)
pp. 1702-1713
David Camp, Lawrence Berkeley National Laboratory, Berkeley and University of California, Davis, Davis
Christoph Garth, University of California, Davis, Davis
Hank Childs, Lawrence Berkeley National Laboratory, Berkeley and University of California, Davis, Davis
Dave Pugmire, Oak Ridge National Laboratory, Oak Ridge
Kenneth I. Joy, University of California, Davis, Davis
Streamline computation in a very large vector field data set represents a significant challenge due to the nonlocal and data-dependent nature of streamline integration. In this paper, we conduct a study of the performance characteristics of hybrid parallel programming and execution as applied to streamline integration on a large, multicore platform. With multicore processors now prevalent in clusters and supercomputers, there is a need to understand the impact of these hybrid systems in order to make the best implementation choice. We use two MPI-based distribution approaches based on established parallelization paradigms, parallelize over seeds and parallelize over blocks, and present a novel MPI-hybrid algorithm for each approach to compute streamlines. Our findings indicate that the work sharing between cores in the proposed MPI-hybrid parallel implementation results in much improved performance and consumes less communication and I/O bandwidth than a traditional, nonhybrid distributed implementation.
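As an illustration of the parallelize-over-seeds idea named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes an analytic 2D rotation field in place of a large block-decomposed data set, uses fourth-order Runge-Kutta integration, and lets Python threads stand in for the cores of one node; MPI is not used. All function names (`velocity`, `rk4_step`, `integrate_streamline`, `streamlines_over_seeds`) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
import math


def velocity(p):
    """Analytic stand-in for a vector field: rigid rotation about the origin."""
    x, y = p
    return (-y, x)


def rk4_step(p, h):
    """Advance point p by one fourth-order Runge-Kutta step of size h."""
    def shifted(q, k, s):
        # Point q displaced by s times the velocity sample k.
        return (q[0] + s * k[0], q[1] + s * k[1])

    k1 = velocity(p)
    k2 = velocity(shifted(p, k1, h / 2))
    k3 = velocity(shifted(p, k2, h / 2))
    k4 = velocity(shifted(p, k3, h))
    return (
        p[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
        p[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]),
    )


def integrate_streamline(seed, h=0.01, steps=100):
    """Return the list of points visited from seed (seed included)."""
    points = [seed]
    for _ in range(steps):
        points.append(rk4_step(points[-1], h))
    return points


def streamlines_over_seeds(seeds, workers=4):
    """Parallelize over seeds: each worker integrates whole streamlines."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(integrate_streamline, seeds))


if __name__ == "__main__":
    seeds = [(1.0, 0.0), (0.0, 2.0), (-0.5, 0.5)]
    for seed, line in zip(seeds, streamlines_over_seeds(seeds)):
        radius = math.hypot(*line[-1])
        print(seed, "->", line[-1], "radius", radius, "points", len(line))
```

In this sketch every worker can evaluate the field anywhere, so no data loading or communication appears. In a parallelize-over-blocks scheme, by contrast, each worker would own part of the domain and hand a streamline off when it leaves that part. Python threads here show only the structure of the work sharing, not a realistic speedup.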

[1] C. Garth, F. Gerhardt, X. Tricoche, and H. Hagen, “Efficient Computation and Visualization of Coherent Structures in Fluid Flow Applications,” IEEE Trans. Visualization and Computer Graphics, vol. 13, no. 6, pp. 1464-1471, Nov./Dec. 2007.
[2] H. Krishnan, C. Garth, and K.I. Joy, “Time and Streak Surfaces for Flow Visualization in Large Time-Varying Data Sets,” IEEE Trans. Visualization and Computer Graphics, vol. 15, no. 6, pp. 1267-1274, Nov./Dec. 2009.
[3] T. McLoughlin, R.S. Laramee, R. Peikert, F.H. Post, and M. Chen, “Over Two Decades of Integration-Based, Geometric Flow Visualization,” Computer Graphics Forum, vol. 29, no. 6, pp. 1807-1829, 2010.
[4] D. Pugmire, H. Childs, C. Garth, S. Ahern, and G. Weber, “Scalable Computation of Streamlines on Very Large Datasets,” Proc. Int'l Conf. Supercomputing, 2009.
[5] D. Sujudi and R. Haimes, “Integration of Particles and Streamlines in a Spatially-Decomposed Computation,” Proc. IEEE Parallel Computational Fluid Dynamics Conf., 1996.
[6] D.A. Lane, “UFAT—A Particle Tracer for Time-Dependent Flow Fields,” Proc. IEEE Conf. Visualization, pp. 257-264, 1994.
[7] B. Cabral and L.C. Leedom, “Highly Parallel Vector Visualization Using Line Integral Convolution,” Proc. SIAM Conf. Parallel Processing for Scientific Computing (PPSC '95), pp. 802-807, 1995.
[8] S. Muraki, E.B. Lum, K.-L. Ma, M. Ogata, and X. Liu, “A PC Cluster System for Simultaneous Interactive Volumetric Modeling and Visualization,” Proc. IEEE Symp. Parallel and Large-Data Visualization and Graphics (PVG '03), p. 13, 2003.
[9] D. Ellsworth, B. Green, and P. Moran, “Interactive Terascale Particle Visualization,” Proc. IEEE Conf. Visualization, pp. 353-360, 2004.
[10] S.-K. Ueng, C. Sikorski, and K.-L. Ma, “Out-of-Core Streamline Visualization on Large Unstructured Meshes,” IEEE Trans. Visualization and Computer Graphics, vol. 3, no. 4, pp. 370-380, Oct.-Dec. 1997.
[11] R. Bruckschen, F. Kuester, B. Hamann, and K.I. Joy, “Real-Time Out-of-Core Visualization of Particle Traces,” Proc. IEEE Symp. Parallel and Large-Data Visualization and Graphics (PVG '01), pp. 45-50, 2001.
[12] H. Yu, C. Wang, and K.-L. Ma, “Parallel Hierarchical Visualization of Large Time-Varying 3D Vector Fields,” Proc. Int'l Conf. Supercomputing, 2007.
[13] L. Chen and I. Fujishiro, “Optimizing Parallel Performance of Streamline Visualization for Large Distributed Flow Datasets,” Proc. IEEE VGTC Pacific Visualization Symp. '08, pp. 87-94, 2008.
[14] M. Snir, S. Otto, S. Huss-Lederman, D. Walker, and J. Dongarra, MPI—The Complete Reference: The MPI Core, second ed. MIT Press, 1998.
[15] D.R. Butenhof, Programming with POSIX Threads. Addison-Wesley Longman Publishing, 1997.
[16] R. Chandra, L. Dagum, D. Kohr, D. Maydan, J. McDonald, and R. Menon, Parallel Programming in OpenMP. Morgan Kaufmann Publishers Inc., 2001.
[17] CUDA Programming Guide Version 2.3, NVIDIA Corporation, 2008.
[18] “High-Performance Fortran Language Specification, Version 1.0,” Technical Report CRPC-TR92225, High Performance Fortran Forum, 1997.
[19] T. El-Ghazawi, W. Carlson, T. Sterling, and K. Yelick, UPC—Distributed Shared Memory Programming. John Wiley & Sons, 2005.
[20] G. Hager, G. Jost, and R. Rabenseifner, “Communication Characteristics and Hybrid MPI/OpenMP Parallel Programming on Clusters of Multi-Core SMP Nodes,” Proc. Cray User Group Conf., 2009.
[21] D. Mallón, G. Taboada, C. Teijeiro, J. Touriño, B. Fraguela, A. Gómez, R. Doallo, and J. Mourino, “Performance Evaluation of MPI, UPC and OpenMP on Multicore Architectures,” Proc. European PVM/MPI Users' Group Meeting (EuroPVM/MPI), Sept. 2009.
[22] E. Endeve, C.Y. Cardall, R.D. Budiardja, and A. Mezzacappa, “Generation of Strong Magnetic Fields in Axisymmetry by the Stationary Accretion Shock Instability,” ArXiv E-Prints, Nov. 2008.
[23] C.Y. Cardall, A.O. Razoumov, E. Endeve, E.J. Lentz, and A. Mezzacappa, “Toward Five-Dimensional Core-Collapse Supernova Simulations,” J. Physics: Conf. Series, vol. 16, pp. 390-394, 2005.
[24] C. Sovinec, A. Glasser, T. Gianakon, D. Barnes, R. Nebel, S. Kruger, S. Plimpton, A. Tarditi, and M. Chu, “Nonlinear Magnetohydrodynamics with High-Order Finite Elements,” J. Computational Physics, vol. 195, pp. 355-386, 2004.
[25] P. Fischer, J. Lottes, D. Pointer, and A. Siegel, “Petascale Algorithms for Reactor Hydrodynamics,” J. Physics: Conf. Series, vol. 125, pp. 1-5, 2008.
[26] “VisIt—Software that Delivers Parallel, Interactive Visualization,” http:/, 2011.
[27] H. Childs, E.S. Brugger, K.S. Bonnell, J.S. Meredith, M. Miller, B.J. Whitlock, and N. Max, “A Contract-Based System for Large Data Visualization,” Proc. IEEE Conf. Visualization, pp. 190-198, 2005.

Index Terms:
Concurrent programming, parallel programming, modes of computation, parallelism and concurrency, picture/image generation, display algorithms.
David Camp, Christoph Garth, Hank Childs, Dave Pugmire, Kenneth I. Joy, "Streamline Integration Using MPI-Hybrid Parallelism on a Large Multicore Architecture," IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 11, pp. 1702-1713, Nov. 2011, doi:10.1109/TVCG.2010.259