Issue No.05 - May (2013 vol.62)
pp: 844-857
Hoang Le , University of Southern California, Los Angeles
Viktor K. Prasanna , University of Southern California, Los Angeles
ABSTRACT
In Network Intrusion Detection Systems (NIDSs), string pattern matching must run at exceptionally high speed to match the content of network traffic against a predefined database (or dictionary) of malicious patterns. Much work has been done in this field; however, most prior work has poor memory efficiency (defined as the ratio of the required storage in bytes to the size of the dictionary in number of characters). Due to this inefficiency, state-of-the-art designs cannot support large dictionaries without using high-latency external DRAM. We propose an algorithm called "leaf-attaching" to preprocess a given dictionary without increasing the number of patterns. The resulting set of postprocessed patterns can be searched using any tree-search data structure. We also present a scalable, high-throughput, Memory-efficient Architecture for large-scale String Matching (MASM) based on a pipelined binary search tree. The proposed algorithm and architecture achieve a memory efficiency of 0.56 (for the Rogets dictionary) and 1.32 (for the Snort dictionary). As a result, our design scales well to support larger dictionaries. Implementations on 45 nm ASIC and a state-of-the-art FPGA device (for the latest Rogets and Snort dictionaries) show that our architecture achieves 24 and 3.2 Gbps, respectively. The MASM module can simply be duplicated to accept multiple characters per cycle, so throughput scales with the number of characters processed in each cycle. A dictionary update involves simply rewriting the content of the memory, which can be done quickly without reconfiguring the chip.
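The memory-efficiency metric defined in the abstract (required storage in bytes divided by dictionary size in characters) can be illustrated with a minimal sketch. The function names and the one-million-character dictionary size below are hypothetical, chosen only for illustration; the 0.56 and 1.32 figures are the ones reported in the abstract. Under this definition a smaller value means less storage per character.

```python
def memory_efficiency(storage_bytes: float, dictionary_chars: int) -> float:
    """Bytes of storage required per dictionary character."""
    if dictionary_chars <= 0:
        raise ValueError("dictionary_chars must be positive")
    return storage_bytes / dictionary_chars


def required_storage_bytes(efficiency: float, dictionary_chars: int) -> float:
    """Storage implied by a given memory efficiency and dictionary size."""
    if dictionary_chars <= 0:
        raise ValueError("dictionary_chars must be positive")
    return efficiency * dictionary_chars


if __name__ == "__main__":
    # Hypothetical dictionary size, NOT taken from the paper.
    hypothetical_chars = 1_000_000
    # Memory-efficiency values as reported in the abstract.
    reported = {"Rogets": 0.56, "Snort": 1.32}
    for name, efficiency in reported.items():
        storage = required_storage_bytes(efficiency, hypothetical_chars)
        print(f"{name}: about {storage / 1e6:.2f} MB per {hypothetical_chars:,} characters")
```

The sketch only restates the ratio; it does not model the leaf-attaching algorithm or the MASM pipeline.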
INDEX TERMS
Dictionaries, Pattern matching, Vectors, Throughput, Databases, Memory management, Snort, Rogets, String matching, reconfigurable, field-programmable gate array (FPGA), ASIC, pipeline, leaf attaching, Aho-Corasick, DFA
CITATION
Hoang Le, Viktor K. Prasanna, "A Memory-Efficient and Modular Approach for Large-Scale String Pattern Matching", IEEE Transactions on Computers, vol. 62, no. 5, pp. 844-857, May 2013, doi:10.1109/TC.2012.38