Issue No.01 - January (2011 vol.22)
pp: 46-57
Song Jun Park, US Army Research Laboratory, Aberdeen Proving Ground, MD
James A. Ross, High Performance Technologies, Inc., Aberdeen Proving Ground, MD
Dale R. Shires, US Army Research Laboratory, Aberdeen Proving Ground, MD
David A. Richie, Brown Deer Technology, Forest Hill, MD
Brian J. Henz, US Army Research Laboratory, Aberdeen Proving Ground, MD
Lam H. Nguyen, US Army Research Laboratory, Aberdeen Proving Ground, MD
To move High-Performance Computing (HPC) closer to forward operating environments and missions, the Army Research Laboratory is developing approaches using hybrid, asymmetric core computing. By blending capabilities found in Graphics Processing Units (GPUs) and traditional von Neumann multicore Central Processing Units (CPUs), approaches are being developed and optimized to provide at or near real-time processing speeds for research project applications. Algorithms are designed to partition work to resources best designed to handle the processing load. The use of commodity resources allows the design to be flexible throughout the life cycle without the costly and time-consuming delays associated with Application-Specific Integrated Circuit (ASIC) development. This paradigm allows for rapid technology transfer to end users. In this paper, we describe a synchronous impulse reconstruction radar imaging algorithm that has been designed for hybrid CPU-GPU processing. We discuss various optimizations such as asynchronous task partitioning between the CPU and GPU as well as data movement reduction. We also discuss analysis and design of the algorithms within the context of two programming models: NVIDIA's CUDA and AMD's ATI Brook+. Finally, we report on the speedup achieved by this approach that allowed us to take a code once restricted to postprocessing and transform it into one that exceeds real-time performance requirements.
Heterogeneous (hybrid) systems, emerging technologies, signal processing systems, computers in other systems—military.
Song Jun Park, James A. Ross, Dale R. Shires, David A. Richie, Brian J. Henz, Lam H. Nguyen, "Hybrid Core Acceleration of UWB SIRE Radar Signal Processing", IEEE Transactions on Parallel & Distributed Systems, vol.22, no. 1, pp. 46-57, January 2011, doi:10.1109/TPDS.2010.117