Issue No.06 - June (2013 vol.62)
pp: 1156-1169
Xinyan Zha, University of Florida, Gainesville
Sartaj Sahni, University of Florida, Gainesville
We develop GPU adaptations of the Aho-Corasick (AC) and multipattern Boyer-Moore string matching algorithms for two cases: GPU-to-GPU (the input to the algorithms is initially in GPU memory and the output is left in GPU memory) and host-to-host (input and output are in the memory of the host CPU). For the GPU-to-GPU case, we consider several refinements to a base GPU implementation and measure the performance gain from each refinement. For the host-to-host case, we analyze two strategies to communicate between the host and the GPU and show that one is optimal with respect to runtime while the other requires less device memory. This analysis is done for GPUs with one I/O channel to the host as well as for those with two. Experiments conducted on an NVIDIA Tesla GT200 GPU that has 240 cores, hosted by a Xeon 2.8 GHz quad-core CPU, show that, for the GPU-to-GPU case, our Aho-Corasick GPU adaptation achieves a speedup between 8.5 and 9.5 relative to a single-threaded CPU implementation and between 2.4 and 3.2 relative to the best multithreaded implementation. For the host-to-host case, the GPU AC code achieves a speedup of 3.1 relative to a single-threaded CPU implementation. However, the GPU is unable to deliver any speedup relative to the best multithreaded code running on the quad-core host; the measured speedups for this comparison ranged between 0.74 and 0.83, meaning the GPU code was slower. Early versions of our multipattern Boyer-Moore adaptations ran 7 to 10 percent slower than the corresponding versions of the AC adaptations, and we did not refine the multipattern Boyer-Moore codes further.
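For orientation, the Aho-Corasick method named in the abstract can be sketched as a single-threaded CPU routine, roughly the kind of baseline the reported speedups are measured against. This is a minimal illustrative sketch in Python, not the authors' code and not their GPU adaptation; the function names `build_automaton` and `search` and the table layout are assumptions made for this example.

```python
from collections import deque


def build_automaton(patterns):
    """Build goto, failure, and output tables for a set of patterns.

    Illustrative layout (an assumption of this sketch): state 0 is the root,
    goto[s] maps a character to the next state, fail[s] is the failure link,
    and out[s] lists the patterns that end at state s.
    """
    goto, fail, out = [{}], [0], [[]]
    # Phase 1: insert every pattern into a trie.
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)
    # Phase 2: breadth-first pass that sets failure links and merges outputs.
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] = out[t] + out[fail[t]]
    return goto, fail, out


def search(text, automaton):
    """Return (start_index, pattern) for every occurrence in text."""
    goto, fail, out = automaton
    state, matches = 0, []
    for i, ch in enumerate(text):
        # Follow failure links until a transition on ch exists or we hit the root.
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            matches.append((i - len(pat) + 1, pat))
    return matches


if __name__ == "__main__":
    automaton = build_automaton(["he", "she", "his", "hers"])
    print(search("ushers", automaton))
```

The scan reads each input character once and follows failure links instead of restarting, which is why the search time depends on the text length and number of matches rather than on the number of patterns. A GPU version would additionally have to partition the input among threads and handle matches that straddle partition boundaries; that part is not shown here.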
Graphics processing unit, Instruction sets, Pattern matching, Doped fiber amplifiers, Arrays, Bandwidth, Dictionaries, CUDA, Multipattern string matching, Aho-Corasick, multipattern Boyer-Moore, GPU
Xinyan Zha, Sartaj Sahni, "GPU-to-GPU and Host-to-Host Multipattern String Matching on a GPU", IEEE Transactions on Computers, vol.62, no. 6, pp. 1156-1169, June 2013, doi:10.1109/TC.2012.61