Issue No. 06 - June (2013 vol. 62)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TC.2012.64
Jiafeng Xie, Central South University, Changsha
Pramod Kumar Meher, Institute for Infocomm Research, Singapore
Jianjun He, Central South University, Changsha
This paper presents an efficient decomposition scheme for the hardware-efficient realization of the discrete cosine transform (DCT) based on distributed arithmetic (DA). We propose an efficient design for implementing cyclic convolution based on a group distributed arithmetic (GDA) technique, in which the read-only memory size can be reduced relative to the existing GDA-based design. The proposed structure for DCT implementation, based on the new decomposition scheme and the proposed design of GDA-based cyclic convolution, involves significantly less area complexity than the existing one. For example, to implement the DCT of transform length $N = 17$, the proposed design needs a lookup table of 128 words, while the existing design for $N = 16$ requires a lookup table of 256 words. The synthesis results show that the proposed design involves significantly less area, gives higher throughput, and consumes less power than the existing designs of nearly the same or lower lengths.
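As a rough illustration of the distributed-arithmetic idea the abstract relies on, the sketch below computes a cyclic convolution with one fixed operand using a single lookup table and bit-serial accumulation. It is plain textbook DA, not the paper's GDA structure or its DCT decomposition; the function names, the integer coefficients, and the 8-bit input width are all assumptions made for the example. It does show why memory size is the central cost: a table addressed by K input bits holds 2^K words.

```python
def build_lut(coeffs):
    """Precompute every partial sum of the fixed coefficients (2**K words)."""
    k = len(coeffs)
    return [
        sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
        for addr in range(1 << k)
    ]


def da_inner_product(lut, xs, width):
    """Bit-serial inner product of the LUT's coefficients with signed inputs.

    Each input is treated as a `width`-bit two's-complement number, so the
    most significant bit carries a negative weight.
    """
    acc = 0
    for b in range(width):
        # Gather bit b of every input into one LUT address.
        addr = 0
        for i, x in enumerate(xs):
            addr |= ((x >> b) & 1) << i
        term = lut[addr] << b
        acc += -term if b == width - 1 else term
    return acc


def da_cyclic_convolution(h, x, width=8):
    """y[k] = sum_i h[i] * x[(k - i) mod n], with h fixed and stored as a LUT."""
    n = len(h)
    lut = build_lut(h)  # one table is shared by all n outputs
    return [
        da_inner_product(lut, [x[(k - i) % n] for i in range(n)], width)
        for k in range(n)
    ]


def direct_cyclic_convolution(h, x):
    """Reference implementation using ordinary multiplications."""
    n = len(h)
    return [sum(h[i] * x[(k - i) % n] for i in range(n)) for k in range(n)]
```

Because every output of a cyclic convolution uses the same coefficients against a rotated input vector, one table serves all outputs; only the addressing changes. In this sketch a 7-input table has 128 words and an 8-input table has 256, which is simple 2^K arithmetic and not a statement about how the paper arrives at its own table sizes.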
Discrete cosine transforms, Convolution, Sparse matrices, Matrix decomposition, Hardware, Read only memory, hardware efficient, Distributed arithmetic (DA), cyclic convolution, discrete cosine transform (DCT)
P. K. Meher, J. Xie and J. He, "Hardware-Efficient Realization of Prime-Length DCT Based on Distributed Arithmetic," in IEEE Transactions on Computers, vol. 62, no. 6, pp. 1170-1178, June 2013.