Issue No. 02 - April (1993 vol. 5)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/69.219740
<p>An efficient algorithm for matching multiple patterns in a string is described. The algorithm combines the concept of deterministic finite state automata (DFSA) with the Boyer-Moore algorithm to achieve better performance. Experimental results indicate that, in the average case, the algorithm matches patterns sublinearly, i.e., it does not need to inspect every character of the string. The analysis shows that the number of characters inspected decreases as the pattern length increases, and increases slightly as the total number of patterns increases. To match a single eight-character pattern in an English string, the algorithm inspects only about 17% of the characters of the string; when the number of patterns is seven, it inspects about 33%. In an actual test, the algorithm running on a SUN 3/160 takes only 3.7 s to search for seven eight-character patterns in a 1.4-Mbyte English text file.</p>
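The abstract does not give the algorithm's details, so the following is only a rough illustration of why a Boyer-Moore-style shift lets a multi-pattern search skip characters. It is a simplified Set-Horspool-style sketch, not the DFSA-based method of the paper; the function names `build_tables` and `search` are hypothetical and chosen for this example.

```python
from collections import defaultdict


def build_tables(patterns):
    """Precompute a Horspool-style shift table shared by all patterns."""
    if not patterns or any(len(p) == 0 for p in patterns):
        raise ValueError("patterns must be a non-empty list of non-empty strings")
    m = min(len(p) for p in patterns)  # window length = shortest pattern
    shift = {}                         # character -> safe shift distance
    buckets = defaultdict(list)        # char at window end -> candidate patterns
    for p in patterns:
        # Only the first m characters of each pattern constrain the shift.
        for j in range(m - 1):
            c = p[j]
            shift[c] = min(shift.get(c, m), m - 1 - j)
        buckets[p[m - 1]].append(p)
    return m, shift, buckets


def search(text, patterns):
    """Return (start, pattern) pairs for every occurrence of any pattern."""
    m, shift, buckets = build_tables(patterns)
    matches = []
    i = m - 1  # index of the last character of the current window
    while i < len(text):
        c = text[i]
        start = i - m + 1
        # Verify only the patterns whose m-th character equals the window end.
        for p in buckets.get(c, ()):
            if text.startswith(p, start):
                matches.append((start, p))
        # Characters that occur in no pattern prefix allow a full jump of m.
        i += shift.get(c, m)
    return matches
```

In this sketch, a text character that appears in none of the pattern prefixes moves the window forward by the full length of the shortest pattern. That is consistent with the abstract's observation that longer patterns reduce the number of characters inspected, while more patterns tend to shrink the shifts and increase it.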
multiple pattern match; deterministic finite state automata; Boyer-Moore algorithm; English string; SUN 3/160; eight-character patterns; English text file; finite automata; pattern recognition; word processing
K. Su and J. Fan, "An Efficient Algorithm for Matching Multiple Patterns," in IEEE Transactions on Knowledge & Data Engineering, vol. 5, no. 2, pp. 339-351, 1993.