2014 23rd International Conference on Parallel Architecture and Compilation (PACT) (2014)
Edmonton, Canada
Aug. 23, 2014 to Aug. 27, 2014
ISBN: 978-1-5090-6607-0
pp: 139-150
Robert D. Cameron, School of Computing Science, Simon Fraser University, Surrey, British Columbia
Thomas C. Shermer, School of Computing Science, Simon Fraser University, Surrey, British Columbia
Arrvindh Shriraman, School of Computing Science, Simon Fraser University, Surrey, British Columbia
Kenneth S. Herdy, School of Computing Science, Simon Fraser University, Surrey, British Columbia
Dan Lin, School of Computing Science, Simon Fraser University, Surrey, British Columbia
Benjamin R. Hull, School of Computing Science, Simon Fraser University, Surrey, British Columbia
Meng Lin, School of Computing Science, Simon Fraser University, Surrey, British Columbia
ABSTRACT
A new parallel algorithm for regular expression matching is developed and applied to the classical grep (global regular expression print) problem. Building on the bitwise data parallelism previously applied to the manual implementation of token scanning in the Parabix XML parser, the new algorithm represents a general solution to the problem of regular expression matching using parallel bit streams. On widely deployed commodity hardware using 128-bit SSE2 SIMD technology, our algorithm implementations can substantially outperform traditional grep implementations based on NFAs, DFAs or backtracking. A 5× or better performance advantage against the best available competitors is not atypical. The algorithms are also designed to scale with the availability of additional parallel resources such as the wider SIMD facilities (256-bit) of Intel AVX2 or future 512-bit extensions. Our AVX2 implementation showed a dramatic reduction in instruction count and a significant improvement in speed. Our GPU implementations show further acceleration.
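To make the idea of matching with parallel bit streams more concrete, the following is a minimal sketch in Python, not the authors' implementation. It uses Python's unbounded integers as stand-ins for bit streams, with bit i corresponding to text position i, and matches the fixed pattern a[0-9]*z. The helper names (class_stream, advance, match_star, match_ends) and the addition-based handling of the star are illustrative assumptions; the abstract itself does not spell out these operations.

```python
def class_stream(text, chars):
    """Return a bit stream where bit i is set when text[i] is in chars."""
    bits = 0
    for i, ch in enumerate(text):
        if ch in chars:
            bits |= 1 << i
    return bits


def advance(markers):
    """Move every match cursor one position forward in the text."""
    return markers << 1


def match_star(markers, cls):
    """Extend each cursor through a run of class characters (assumed star step).

    Adding the class stream to the cursors that sit on a class character
    lets the carry ripple through each run in a single addition.
    """
    return (((markers & cls) + cls) ^ cls) | markers


def match_ends(text):
    """Return positions just past each match of the pattern a[0-9]*z."""
    a = class_stream(text, "a")
    digits = class_stream(text, "0123456789")
    z = class_stream(text, "z")

    m = (1 << len(text)) - 1      # a match may start at any position
    m = advance(m & a)            # consume 'a'
    m = match_star(m, digits)     # consume zero or more digits
    m = advance(m & z)            # consume 'z'
    return [i for i in range(len(text) + 1) if (m >> i) & 1]


print(match_ends("a123z az a1b"))  # [5, 8]
```

Every step handles all text positions at once with a handful of integer operations, which is the property that maps onto SIMD registers in a real implementation. A production version would work on fixed-width blocks and propagate carries between them, which this sketch avoids by relying on arbitrary-precision integers.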
INDEX TERMS
Pattern matching, Parallel processing, Graphics processing units, Hardware, Computer architecture, Throughput, Microprocessors
CITATION
Robert D. Cameron, Thomas C. Shermer, Arrvindh Shriraman, Kenneth S. Herdy, Dan Lin, Benjamin R. Hull, Meng Lin, "Bitwise data parallelism in regular expression matching", 2014 23rd International Conference on Parallel Architecture and Compilation (PACT), pp. 139-150, 2014, doi:10.1145/2628071.2628079