Issue No.01 - Jan. (2013 vol.24)
pp: 92-103
Hao Wang , Texas A&M University, College Station
Shi Pu , Texas A&M University, College Station
Gabe Knezek , Texas A&M University, College Station
Jyh-Charn Liu , Texas A&M University, College Station
ABSTRACT
We propose an NFA-based algorithm called MIN-MAX to support matching of regular expressions (regexps) composed of Character Classes with Constraint Repetitions (CCR). MIN-MAX is well suited to massively parallel processing architectures, such as FPGAs, yet it is effective on any other computing platform. In MIN-MAX, each active CCR engine (which implements one CCR term) evaluates input characters, updates its (MIN, MAX) counters, and asserts control signals; all the CCR engines implemented in the FPGA run simultaneously. Unlike traditional designs, the (MIN, MAX) counters hold dynamically updated lower and upper bounds on the possible matching counts rather than the actual matching counts, so that the feasible matching lengths are compactly enclosed in the counter values. The counter-based design can support constraint repetitions of n using O(log n) memory bits, rather than the O(n) bits required by existing solutions. MIN-MAX can resolve character class ambiguity between adjacent CCR terms and can support overlapped matching when matching collisions are absent. We developed a set of heuristic rules to assess the absence of collisions in CCR-based regexps, and tested them on the Snort and SpamAssassin rule sets. The results show that the vast majority of rules are immune to collisions, so MIN-MAX can support overlapped matching cost-effectively. As a bonus, the new architecture also supports fast reconfiguration via ordinary memory writes rather than resynthesis of the entire design, which is critical for time-sensitive regexp deployment scenarios.
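The bounded-counter idea in the abstract can be illustrated with a small software sketch. This is not the authors' hardware design: the class name `CCREngine`, the `step` and `run` functions, the `activate` signal, and the exact update rules are all assumptions made for illustration. The sketch treats every count between MIN and MAX as feasible, which corresponds to the collision-free case the abstract describes.

```python
from dataclasses import dataclass


@dataclass
class CCREngine:
    """Hypothetical engine for one CCR term such as [ab]{2,3}."""
    char_class: frozenset
    lo: int                # minimum number of repetitions
    hi: int                # maximum number of repetitions
    active: bool = False
    min_count: int = 0     # lower bound on the possible matching counts
    max_count: int = 0     # upper bound on the possible matching counts

    def step(self, ch: str, activate: bool) -> bool:
        """Consume one character; return True if the term can end here.

        `activate` means a new match attempt starts at this character
        (assumed to come from the preceding term of the regexp).
        """
        if activate:
            if not self.active:
                self.active = True
                self.max_count = 0
            self.min_count = 0          # the newest attempt has count 0
        if not self.active:
            return False
        if ch not in self.char_class:
            self.active = False         # every attempt fails at once
            return False
        self.min_count += 1
        self.max_count += 1
        if self.min_count > self.hi:    # all attempts exceeded the bound
            self.active = False
            return False
        # Assumption: counts between MIN and MAX are all feasible, so the
        # upper bound is simply clamped when the oldest attempt expires.
        self.max_count = min(self.max_count, self.hi)
        return self.max_count >= self.lo


def run(engine: CCREngine, text: str, activations: list) -> list:
    """Feed `text` to the engine and collect the per-character match flags."""
    return [engine.step(ch, act) for ch, act in zip(text, activations)]


def counter_bits(hi: int) -> int:
    """Bits for two binary counters that each count up to `hi`."""
    return 2 * hi.bit_length()
```

In this sketch the state of an engine is two counters regardless of how many overlapping match attempts are in flight, so a repetition bound of 1000 needs about 20 bits of counter state instead of one state bit per count.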
INDEX TERMS
Radiation detectors, Engines, Doped fiber amplifiers, Field programmable gate arrays, Computer architecture, Registers, Algorithm design and analysis, Reconfigurable hardware, Nondeterministic Finite Automata
CITATION
Hao Wang, Shi Pu, Gabe Knezek, Jyh-Charn Liu, "MIN-MAX: A Counter-Based Algorithm for Regular Expression Matching", IEEE Transactions on Parallel & Distributed Systems, vol.24, no. 1, pp. 92-103, Jan. 2013, doi:10.1109/TPDS.2012.116