Issue No. 11 - November (2011 vol. 60)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TC.2010.250
Derek Pao, City University of Hong Kong, Hong Kong
Xing Wang, Peking University, Shenzhen
Xiaoran Wang, City University of Hong Kong, Hong Kong
Cong Cao, City University of Hong Kong, Hong Kong
Yuesheng Zhu, Peking University, Shenzhen
A memory-efficient hardware string searching engine for antivirus applications is presented. The proposed QSV method is based on quick sampling of the input stream against fixed-length pattern prefixes and on-demand verification of variable-length pattern suffixes. Patterns handled by the QSV method must be at least 16 bytes long and have distinct 16-byte prefixes; the latter requirement can be met by a preprocessing procedure. The search engine uses the pipelined Aho-Corasick (P-AC) architecture developed by the first author to process short patterns of 4 to 15 bytes and a small number of exception cases. The design was evaluated using the ClamAV virus database, which contains 82,888 strings with a total size exceeding 8 Mbyte. In terms of byte count, 99.3 percent of the pattern set is handled by the QSV method and 0.7 percent by P-AC. A pattern with a distinct 16-byte prefix occupies at most three lookup table entries in QSV. The overall memory cost of the system is about 1.4 Mbyte, i.e., about 1.4 bits per character of the ClamAV pattern set. Because the proposed method is memory-based, updates to the pattern set can be accommodated by modifying the contents of the lookup tables without reconfiguring the hardware circuits.
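The prefix-then-suffix idea described in the abstract can be illustrated with a minimal software sketch. This is not the authors' hardware design: it assumes a plain Python dictionary in place of the lookup tables, checks every input offset rather than modeling the paper's sampling scheme, and simply rejects patterns that are shorter than 16 bytes or share a prefix (cases the abstract assigns to P-AC or to preprocessing). The names `PREFIX_LEN`, `build_prefix_table`, and `scan` are hypothetical.

```python
PREFIX_LEN = 16  # fixed prefix length stated in the abstract


def build_prefix_table(patterns):
    """Map each distinct 16-byte prefix to its full pattern (illustrative only)."""
    table = {}
    for pattern in patterns:
        if len(pattern) < PREFIX_LEN:
            # The abstract routes patterns of 4 to 15 bytes to P-AC instead.
            raise ValueError("pattern shorter than 16 bytes")
        prefix = pattern[:PREFIX_LEN]
        if prefix in table:
            # The abstract says a preprocessing procedure resolves this case;
            # that procedure is not modeled here.
            raise ValueError("duplicate 16-byte prefix")
        table[prefix] = pattern
    return table


def scan(data, table):
    """Return (offset, pattern) for every full pattern match in data."""
    matches = []
    for offset in range(len(data) - PREFIX_LEN + 1):
        # Step 1: look up the fixed-length window in the prefix table.
        pattern = table.get(data[offset:offset + PREFIX_LEN])
        # Step 2: only on a prefix hit, verify the variable-length remainder.
        if pattern is not None and data.startswith(pattern, offset):
            matches.append((offset, pattern))
    return matches
```

In this sketch the common case costs one table lookup per offset, and the longer comparison runs only when a prefix hits, which is the intuition behind verifying suffixes on demand.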
String searching, antivirus system, system security, embedded system.
Derek Pao, Xing Wang, Xiaoran Wang, Cong Cao, Yuesheng Zhu, "String Searching Engine for Virus Scanning", IEEE Transactions on Computers, vol. 60, no. 11, pp. 1596-1609, November 2011, doi:10.1109/TC.2010.250