Novel Table Lookup-Based Algorithms for High-Performance CRC Generation
November 2008 (vol. 57 no. 11)
pp. 1550-1560
Michael E. Kounavis, Intel Corporation, Hillsboro
Frank L. Berry, Intel Corporation, Hillsboro
A framework for designing a family of novel fast CRC generation algorithms is presented. Our algorithms can ideally read arbitrarily large amounts of data at a time, while optimizing their memory requirement to meet the constraints of specific computer architectures. In addition, our algorithms can be implemented in software using commodity processors instead of specialized parallel circuits. We use this framework to design two efficient algorithms that run on the popular Intel IA32 processor architecture. First, a 'slicing-by-4' algorithm doubles the performance of existing software-based, table-driven CRC implementations based on the Sarwate [12] algorithm while using a 4K cache footprint. Second, a 'slicing-by-8' algorithm triples the performance of existing software-based CRC implementations while using an 8K cache footprint. Whereas well-known software-based CRC implementations compute the current CRC value from a bit-stream reading 8 bits at a time, our algorithms read 32 and 64 bits at a time, respectively. The slicing-by-8 source code is freely available for experimentation and can be found at:
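To make the contrast in the abstract concrete, the following is a minimal Python sketch of a byte-at-a-time table lookup in the style of Sarwate [12] next to a slicing-by-4 loop that consumes 32 bits per iteration. It is not the authors' released code. Everything specific here is an assumption for illustration: the reflected CRC-32 polynomial 0xEDB88320, the 0xFFFFFFFF initial and final XOR values, little-endian word reads, and the names `make_tables`, `crc32_sarwate`, and `crc32_slicing_by_4`.

```python
# Assumption: reflected CRC-32 polynomial; the abstract does not name one.
POLY = 0xEDB88320


def make_tables(n_slices=4):
    """Build n_slices lookup tables of 256 32-bit entries each."""
    # tables[0] is the classic byte-at-a-time table.
    base = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ POLY if crc & 1 else crc >> 1
        base.append(crc)
    tables = [base]
    # tables[k][i] is tables[0][i] advanced through k further zero bytes.
    for k in range(1, n_slices):
        prev = tables[k - 1]
        tables.append([(prev[i] >> 8) ^ base[prev[i] & 0xFF] for i in range(256)])
    return tables


TABLES = make_tables(4)  # 4 tables x 256 entries x 4 bytes = 4 KB of table data


def crc32_sarwate(data, crc=0):
    """Byte-at-a-time lookup: one table access per 8 input bits."""
    crc ^= 0xFFFFFFFF
    t0 = TABLES[0]
    for b in data:
        crc = (crc >> 8) ^ t0[(crc ^ b) & 0xFF]
    return crc ^ 0xFFFFFFFF


def crc32_slicing_by_4(data, crc=0):
    """Slicing-by-4: four independent table accesses per 32 input bits."""
    crc ^= 0xFFFFFFFF
    t0, t1, t2, t3 = TABLES
    n = len(data) - len(data) % 4
    for i in range(0, n, 4):
        # Fold the next 32-bit word into the running CRC, then slice the
        # result into four bytes, each looked up in its own table.
        crc ^= int.from_bytes(data[i:i + 4], "little")
        crc = (t3[crc & 0xFF] ^ t2[(crc >> 8) & 0xFF]
               ^ t1[(crc >> 16) & 0xFF] ^ t0[crc >> 24])
    # Any trailing bytes (fewer than 4) fall back to the byte-wise loop.
    for b in data[n:]:
        crc = (crc >> 8) ^ t0[(crc ^ b) & 0xFF]
    return crc ^ 0xFFFFFFFF
```

Under these assumptions the four tables occupy 4 KB, which is consistent with the 4K footprint quoted above; extending the same construction to eight tables and 64-bit reads would give 8 KB. The four lookups in the inner loop do not depend on one another, which is what lets a processor overlap them, although a Python sketch like this one shows only the structure, not the speedup.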

[1] G. Albertengo and R. Sisto, “Parallel CRC Generation,” IEEE Micro, Oct. 1990.
[2] “Architectural Specifications for RDMA over TCP/IP,” RDMA Consortium Web Site, http:/, 2007.
[3] F. Braun and M. Waldvogel, “Fast Incremental CRC Updates for IP over ATM Networks,” Proc. IEEE Workshop High Performance Switching and Routing (HPSR), 2001.
[4] P. Culley, U. Elzur, R. Recio, S. Bailey, and J. Carrier, Marker MPU Aligned Framing for TCP Specification, Internet draft, work in progress, expired Jan. 2005, July 2004.
[5] A. Doering and M. Waldvogel, “Fast and Flexible CRC Calculation,” Electronics Letters, Jan. 2004.
[6] D. Feldmeier, “Fast Software Implementation of Error Detection Codes,” IEEE Trans. Networking, Dec. 1995.
[7] G. Griffiths and G.C. Stones, “The Tea-Leaf Reader Algorithm: An Efficient Implementation of CRC-16 and CRC-32,” Comm. ACM, vol. 30, no. 7, pp. 617-620, July 1987.
[8] C.M. Heard, “AAL2 CPS-PH HEC Calculations Using Table Lookups,” 2007.
[9] S.M. Joshi, P.K. Dubey, and M.A. Kaplan, “A New Parallel Algorithm for CRC Generation,” Proc. IEEE Int'l Conf. Comm. (ICC), 2000.
[10] M.C. Nielson, “Method for High Speed CRC Computation,” IBM Technical Disclosure Bull., vol. 27, no. 6, pp. 3572-3576, Nov. 1984.
[11] T.V. Ramabadran and S.V. Gaitonde, “A Tutorial on CRC Computations,” IEEE Micro, vol. 8, no. 4, pp. 62-75, Aug. 1988.
[12] D.V. Sarwate, “Computation of Cyclic Redundancy Checks via Table Lookup,” Comm. ACM, vol. 31, no. 8, pp. 1008-1013, Aug. 1988.
[13] J. Satran, K. Meth, C. Sapuntzakis, M. Chadalapaka, and E. Zeidner, Internet Small Computer Systems Interface (iSCSI), RFC 3720, Apr. 2004.
[14] M.D. Shieh, M.H. Sheu, C.H. Chen, and H.F. Lo, “A Systematic Approach for Parallel CRC Computations,” J. Information Science and Eng., vol. 17, pp. 445-461, 2001.
[15] A. Perez, “Byte-Wise CRC Calculations,” IEEE Micro, vol. 3, no. 3, pp. 40-50, June 1983.
[16] R.N. Williams, “A Painless Guide to CRC Error Detection Algorithms,” technical report, http://www.ross.net/crc, Aug. 1993.
[17] G. Castagnoli, S. Brauer, and M. Herrmann, “Optimization of Cyclic Redundancy Check Codes with 24 and 32 Parity Bits,” IEEE Trans. Comm., vol. 41, no. 6, pp. 883-892, 1993.

Index Terms:
Error handling and recovery, Mathematical Software, Data communications, Network Protocols
Michael E. Kounavis, Frank L. Berry, "Novel Table Lookup-Based Algorithms for High-Performance CRC Generation," IEEE Transactions on Computers, vol. 57, no. 11, pp. 1550-1560, Nov. 2008, doi:10.1109/TC.2008.85