CASM: A VLSI Chip for Approximate String Matching
August 1995 (vol. 17 no. 8)
pp. 824-830

Abstract—The edit distance between two strings a_1, …, a_m and b_1, …, b_n is the minimum cost s of a sequence of editing operations (insertions, deletions, and substitutions) that converts one string into the other. This paper describes the design and implementation of a linear systolic array chip for computing the edit distance between two strings over a given alphabet. An encoding scheme is proposed which reduces the number of bits required to represent a state in the computation. The architecture is a parallel realization of the standard dynamic programming algorithm proposed by Wagner and Fischer, and can perform approximate string matching for variable edit costs. More importantly, the architecture does not place any constraint on the lengths of the strings that can be compared. It makes use of simple basic cells and requires regular nearest-neighbor communication, which makes it suitable for VLSI implementation. A prototype of this array has been built at the University of South Florida.
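As a rough software illustration of the computation the abstract refers to, the Python sketch below implements the Wagner and Fischer dynamic programming recurrence with variable edit costs, first row by row and then in anti-diagonal (wavefront) order. Cells on one anti-diagonal depend only on the two preceding anti-diagonals, which is the property that allows a parallel array realization. This is not the chip's design: the function names, the cost parameters `ins_cost`, `del_cost`, and `sub_cost`, and the uniform per-operation costs are assumptions made for illustration, and the paper's state-encoding scheme is not reproduced here.

```python
def edit_distance(a, b, ins_cost=1, del_cost=1, sub_cost=1):
    """Row-by-row Wagner-Fischer recurrence with variable edit costs.

    Costs are assumed uniform per operation type (an illustrative choice).
    """
    m, n = len(a), len(b)
    # Row 0: build the first j symbols of b from the empty string.
    prev = [j * ins_cost for j in range(n + 1)]
    for i in range(1, m + 1):
        # Column 0: delete the first i symbols of a.
        curr = [i * del_cost] + [0] * n
        for j in range(1, n + 1):
            diag = prev[j - 1] + (0 if a[i - 1] == b[j - 1] else sub_cost)
            curr[j] = min(prev[j] + del_cost,      # delete a[i-1]
                          curr[j - 1] + ins_cost,  # insert b[j-1]
                          diag)                    # match or substitute
        prev = curr
    return prev[n]


def edit_distance_wavefront(a, b, ins_cost=1, del_cost=1, sub_cost=1):
    """Same recurrence, evaluated one anti-diagonal (i + j = k) at a time.

    All cells on one anti-diagonal are mutually independent, so a parallel
    array could compute them in the same step. Here they are computed
    sequentially to keep the sketch simple.
    """
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * del_cost
    for j in range(1, n + 1):
        d[0][j] = j * ins_cost
    for k in range(2, m + n + 1):
        # Valid rows on this anti-diagonal: 1 <= i <= m and 1 <= k - i <= n.
        for i in range(max(1, k - n), min(m, k - 1) + 1):
            j = k - i
            diag = d[i - 1][j - 1] + (0 if a[i - 1] == b[j - 1] else sub_cost)
            d[i][j] = min(d[i - 1][j] + del_cost,
                          d[i][j - 1] + ins_cost,
                          diag)
    return d[m][n]
```

With unit costs, `edit_distance("kitten", "sitting")` returns 3, and the wavefront version gives the same value. Raising `sub_cost` above `ins_cost + del_cost` makes a deletion followed by an insertion cheaper than a substitution, which is one way variable edit costs change the result.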

[1] K. Abe and N. Sugita, “Distances between strings of symbols—Review and remarks,” Proc. ICPR, pp. 172-174, 1982.
[2] R.A. Baeza-Yates and G.H. Gonnet, “A New Approach to Text Searching,” Comm. ACM, vol. 35, no. 10, pp. 74-82, 1992.
[3] H. Bunke and A. Sanfeliu, eds., Syntactic and Structural Pattern Recognition: Theory and Applications. Singapore: World Scientific Publishing Co., 1990.
[4] H.D. Cheng and K.S. Fu, “VLSI architectures for string matching and pattern matching,” Pattern Recognition, vol. 20, no. 1, pp. 125-141, 1987.
[5] D.T. Hoang, “Searching genetic databases on Splash 2,” Proc. IEEE Workshop FPGAs for Custom Computing Machines, Napa, Calif., 1993.
[6] R. Hughey and D.P. Lopresti, “Architecture of a Programmable Systolic Array,” Proc. Int’l Conf. Systolic Arrays, pp. 41-49, May 1988.
[7] R.J. Lipton and D.P. Lopresti, “A systolic array for rapid string comparison,” 1985 Chapel Hill Conf. on VLSI, H. Fuchs, ed., Rockville, Md.: Computer Science Press, pp. 363-376, 1985.
[8] R.J. Lipton and D. Lopresti, “Delta transformations to simplify VLSI processor arrays for serial dynamic programming,” Proc. ICPP, pp. 917-920, 1986.
[9] H.-C. Liu and M.D. Srinath, “Classification of partial shapes using string-to-string matching,” SPIE Proc. Intelligent Robots and Computer Vision, no. 1002, pp. 92-98, 1989.
[10] D. Lopresti, “Rapid implementation of a genetic sequence comparator using field-programmable logic arrays,” Advanced Research in VLSI, pp. 138-152, Santa Cruz, Calif., 1991.
[11] D.P. Lopresti, “P-NAC: A systolic array for comparing nucleic acid sequences,” Computer, vol. 20, pp. 98-99, 1987.
[12] M. Maes, “Polygonal Shape Recognition Using String-Matching Techniques,” Pattern Recognition, vol. 24, no. 5, pp. 433-440, 1991.
[13] R. Sastry and N. Ranganathan, “A systolic array for approximate string matching,” Proc. Int’l Conf. Computer Design, pp. 402-405, Cambridge, Mass., 1993.
[14] K. Remedios, “A VLSI chip for approximate string matching using variable edit costs,” MS thesis, Dept. of Computer Science and Eng., Univ. of South Florida, 1993.
[15] E. Ukkonen, “Algorithms for approximate string matching,” Information and Control, vol. 64, pp. 100-118, 1985.
[16] R.A. Wagner and M.J. Fischer, “The String-to-String Correction Problem,” J. ACM, vol. 21, no. 1, pp. 168-78, 1974.
[17] W.J. Wilbur and D.J. Lipman, “Rapid similarity searches of nucleic acid protein data banks,” Proc. Nat’l Academy of Sciences USA, vol. 80, pp. 726-730, 1983.
[18] S. Wu and U. Manber, “Fast Text Searching,” Comm. ACM, vol. 35, pp. 83-91, 1992.

Index Terms:
Edit distance computation, string-to-string correction problem, very large scale integration (VLSI) implementation, systolic algorithm, special purpose architecture, hardware algorithm.
Raghu Sastry, N. Ranganathan, Klinton Remedios, "CASM: A VLSI Chip for Approximate String Matching," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, no. 8, pp. 824-830, Aug. 1995, doi:10.1109/34.400575