Issue No. 02 - February (2011 vol. 22)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPDS.2010.66
Gabriel Falcao , University of Coimbra, Coimbra
Leonel Sousa , Technical University of Lisbon, Lisbon
Vitor Silva , University of Coimbra, Coimbra
Unlike usual VLSI approaches necessary for the computation of intensive Low-Density Parity-Check (LDPC) code decoders, this paper presents flexible software-based LDPC decoders. Algorithms and data structures suitable for parallel computing are proposed in this paper to perform LDPC decoding on multicore architectures. To evaluate the efficiency of the proposed parallel algorithms, LDPC decoders were developed on recent multicores, such as off-the-shelf general-purpose x86 processors, Graphics Processing Units (GPUs), and the CELL Broadband Engine (CELL/B.E.). Challenging restrictions, such as memory access conflicts, latency, coalescence, or unknown behavior of thread and block schedulers, were unraveled and worked out. Experimental results for different code lengths show throughputs in the order of 1 \sim 2 Mbps on the general-purpose multicores, and ranging from 40 Mbps on the GPU to nearly 70 Mbps on the CELL/B.E. The analysis of the obtained results allows to conclude that the CELL/B.E. performs better for short to medium length codes, while the GPU achieves superior throughputs with larger codes. They achieve throughputs that in some cases approach very well those obtained with VLSI decoders. From the analysis of the results, we can predict a throughput increase with the rise of the number of cores.
LDPC, data-parallel computing, multicore, graphics processing units, GPU, CUDA, CELL, OpenMP.
V. Silva, G. Falcao and L. Sousa, "Massively LDPC Decoding on Multicore Architectures," in IEEE Transactions on Parallel & Distributed Systems, vol. 22, no. , pp. 309-322, 2010.