Issue No.03 - March (2014 vol.25)
pp: 663-672
Yue Zhao , Hong Kong Polytechnic University, Hong Kong
Francis C.M. Lau , Hong Kong Polytechnic University, Hong Kong
ABSTRACT
This paper proposes efficient LDPC block-code decoders/simulators that run on graphics processing units (GPUs) and also implements a decoder for the LDPC convolutional code (LDPCCC). The LDPCCC is derived from a predesigned quasi-cyclic LDPC block code with good error performance. Compared with a decoder for a randomly constructed LDPCCC, the proposed LDPCCC decoder has lower complexity, owing to the periodicity of the derived LDPCCC and the properties of the quasi-cyclic structure. In the proposed decoder architecture, $\Gamma$ codewords are decoded together, where $\Gamma$ is a multiple of the warp size, so the messages of the $\Gamma$ codewords are also processed together. Because all $\Gamma$ codewords share the same Tanner graph, the messages of the $\Gamma$ distinct codewords that correspond to the same edge can be grouped into one package and stored linearly. With the data structures of the messages used in decoding optimized in this way, the GPU can perform both reads and writes in a highly parallel manner. In addition, a thread hierarchy that minimizes thread divergence is deployed to maximize the efficiency of parallel execution. By using the large number of GPU cores to perform the simple computations simultaneously, the GPU-based LDPC decoder achieves a speedup of hundreds of times over a serial CPU-based simulator and of more than 40 times over an eight-thread CPU-based simulator.
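The message layout described in the abstract can be illustrated with a minimal sketch. It assumes an edge-major flat array in which the $\Gamma$ messages of one edge sit next to each other, and it assumes that lane i of a warp handles codeword i; the function names, the index formula, and the contrasting codeword-major layout are hypothetical and are not taken from the paper. Python is used here only to show the index arithmetic, not the GPU kernel itself.

```python
"""Illustrative index arithmetic for batch LDPC decoding (assumed layout)."""

WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def message_index(edge, codeword, gamma):
    """Flat index in the assumed edge-major layout: one package per edge."""
    if gamma % WARP_SIZE != 0:
        raise ValueError("gamma must be a multiple of the warp size")
    if not 0 <= codeword < gamma:
        raise ValueError("codeword index out of range")
    return edge * gamma + codeword


def codeword_major_index(edge, codeword, num_edges):
    """Contrasting layout (hypothetical): all messages of one codeword together."""
    return codeword * num_edges + edge


def warp_addresses(edge, warp, gamma):
    """Indices touched by one warp when lane i handles codeword warp*32 + i."""
    first = warp * WARP_SIZE
    return [message_index(edge, first + lane, gamma) for lane in range(WARP_SIZE)]


def is_contiguous(indices):
    """True if consecutive lanes touch consecutive array elements."""
    return all(b - a == 1 for a, b in zip(indices, indices[1:]))
```

Under these assumptions, the 32 lanes of a warp that work on the same edge touch 32 consecutive array elements, which is the access pattern a GPU can service with few memory transactions. In the codeword-major alternative, neighbouring lanes are separated by a stride equal to the number of edges.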
INDEX TERMS
Decoding, Message systems, Convolutional codes, Graphics processing units, Block codes, Iterative decoding, LDPCCC decoder, LDPC, LDPC convolutional code, CUDA, graphics processing unit (GPU), OpenMP, parallel computing, LDPC decoder
CITATION
Yue Zhao, Francis C.M. Lau, "Implementation of Decoders for LDPC Block Codes and LDPC Convolutional Codes Based on GPUs", IEEE Transactions on Parallel & Distributed Systems, vol.25, no. 3, pp. 663-672, March 2014, doi:10.1109/TPDS.2013.52