Computation of Normalized Edit Distance and Applications
September 1993 (vol. 15 no. 9)
pp. 926-932

Given two strings X and Y over a finite alphabet, the normalized edit distance between X and Y, d(X,Y), is defined as the minimum of W(P)/L(P), where P is an editing path between X and Y, W(P) is the sum of the weights of the elementary edit operations of P, and L(P) is the number of these operations (the length of P). It is shown that, in general, d(X,Y) cannot be computed by first obtaining the conventional (unnormalized) edit distance between X and Y and then normalizing this value by the length of the corresponding editing path. To compute normalized edit distances, an algorithm that can be implemented to work in O(m*n^2) time and O(n^2) memory space is proposed, where m and n are the lengths of the strings under consideration. Experiments in hand-written digit recognition are presented, revealing that the normalized edit distance consistently provides better results than either the unnormalized or the post-normalized classical edit distance.
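
The definition above can be illustrated with a minimal Python sketch. It is not necessarily the algorithm proposed in the paper: it is a straightforward dynamic program indexed by path length, and for simplicity it stores the full table, so it uses O(m*n*(m+n)) memory rather than the O(n^2) quoted above. The function names, the weights (`w_sub`, `w_indel`), the convention that matching a symbol counts as a zero-weight operation, and the tie-break used for post-normalization are all assumptions made for illustration.

```python
from math import inf


def length_indexed_costs(x, y, w_sub, w_indel):
    """Return best, where best[k] is the minimum weight of an editing
    path from x to y that uses exactly k elementary operations
    (inf if no such path exists)."""
    m, n = len(x), len(y)
    max_len = m + n  # longest possible path: delete all of x, insert all of y
    # D[i][j][k]: min weight turning x[:i] into y[:j] with exactly k operations
    D = [[[inf] * (max_len + 1) for _ in range(n + 1)] for _ in range(m + 1)]
    D[0][0][0] = 0
    for i in range(m + 1):
        for j in range(n + 1):
            for k in range(1, i + j + 1):
                best = inf
                if i > 0:  # deletion of x[i-1]
                    best = min(best, D[i - 1][j][k - 1] + w_indel)
                if j > 0:  # insertion of y[j-1]
                    best = min(best, D[i][j - 1][k - 1] + w_indel)
                if i > 0 and j > 0:  # substitution (weight 0 on a match)
                    cost = 0 if x[i - 1] == y[j - 1] else w_sub
                    best = min(best, D[i - 1][j - 1][k - 1] + cost)
                D[i][j][k] = best
    return D[m][n]


def normalized_edit_distance(x, y, w_sub=5, w_indel=6):
    """Minimum of W(P)/L(P) over all editing paths P."""
    best = length_indexed_costs(x, y, w_sub, w_indel)
    ratios = [w / k for k, w in enumerate(best) if k > 0 and w < inf]
    return min(ratios) if ratios else 0.0  # two empty strings: distance 0


def post_normalized_edit_distance(x, y, w_sub=5, w_indel=6):
    """Conventional edit distance divided by the length of a
    minimum-weight path (ties broken toward the longest such path)."""
    best = length_indexed_costs(x, y, w_sub, w_indel)
    w_min = min(best)
    k = max(k for k, w in enumerate(best) if w == w_min)
    return w_min / k if k > 0 else 0.0


if __name__ == "__main__":
    print(normalized_edit_distance("ab", "ba"))       # 4.0
    print(post_normalized_edit_distance("ab", "ba"))  # 5.0
```

With these illustrative weights, the minimum-weight path from "ab" to "ba" consists of two substitutions (W = 10, L = 2, ratio 5), whereas deleting "a", matching "b", and inserting "a" gives a heavier but longer path (W = 12, L = 3, ratio 4). The two quantities therefore differ, which is the point made in the abstract about post-normalization.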

Index Terms:
character strings; words; normalized edit distance; finite alphabet; hand-written digit recognition; computational complexity; pattern recognition
A. Marzal, E. Vidal, "Computation of Normalized Edit Distance and Applications," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 9, pp. 926-932, Sept. 1993, doi:10.1109/34.232078