Fast Computation of Normalized Edit Distances
September 1995 (vol. 17 no. 9)
pp. 899-902

Abstract: The Normalized Edit Distance (NED) between two strings X and Y is defined as the minimum quotient between the sum of weights of the edit operations required to transform X into Y and the length of the editing path corresponding to these operations. An algorithm for computing the NED has recently been introduced by Marzal and Vidal that exhibits O(mn^2) computing complexity, where m and n are the lengths of X and Y. We propose here an algorithm that is observed to require in practice the same O(mn) computing resources as the conventional unnormalized Edit Distance algorithm does. The performance of this algorithm is illustrated through computational experiments with synthetic data, as well as with real data consisting of OCR chain-coded strings.
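
To make the definition concrete, the following is a minimal Python sketch of one way to minimize the weight-to-length quotient with repeated O(mn) passes. It uses a Dinkelbach-style fractional programming iteration, which the index terms and reference [2] suggest but the abstract does not spell out, so it should be read as an illustration and not as the authors' exact algorithm. The function name, the unit default weights, the convention that a zero-cost match counts as one step of the editing path, and the value returned for two empty strings are all assumptions made for this example.

```python
def normalized_edit_distance(x, y, w_sub=1.0, w_ins=1.0, w_del=1.0, max_iter=100):
    """Minimum over editing paths of (sum of weights) / (path length).

    Assumptions (not from the source): unit default weights, and a
    zero-cost match still counts as one step of the editing path.
    """
    m, n = len(x), len(y)
    if m + n == 0:
        return 0.0  # convention assumed here for two empty strings

    def best_ratio(lam):
        # Standard O(mn) edit-distance DP on the shifted cost (weight - lam)
        # per operation. Each cell stores (shifted cost, weight, length).
        def step(cell, w):
            cost, total_w, total_len = cell
            return (cost + w - lam, total_w + w, total_len + 1)

        prev = [(0.0, 0.0, 0)]
        for j in range(1, n + 1):
            prev.append(step(prev[j - 1], w_ins))
        for i in range(1, m + 1):
            cur = [step(prev[0], w_del)]
            for j in range(1, n + 1):
                sub = 0.0 if x[i - 1] == y[j - 1] else w_sub
                cur.append(min(step(prev[j - 1], sub),    # substitute or match
                               step(prev[j], w_del),      # delete x[i-1]
                               step(cur[j - 1], w_ins)))  # insert y[j-1]
            prev = cur
        _, weight, length = prev[n]
        return weight / length  # quotient of the path that minimized the shifted cost

    # Start from the quotient of an ordinary minimum-weight path (lam = 0),
    # then re-solve with the current quotient until it stops decreasing.
    lam = best_ratio(0.0)
    for _ in range(max_iter):
        new_lam = best_ratio(lam)
        if new_lam >= lam - 1e-12:
            break
        lam = new_lam
    return lam


if __name__ == "__main__":
    print(normalized_edit_distance("ab", "ba"))
```

With unit weights, "ab" and "ba" have an ordinary edit distance of 2 (two substitutions on a path of length 2, quotient 1), but the path delete, match, insert has weight 2 and length 3, so the sketch returns 2/3. Each pass costs the same as the conventional edit distance computation; the number of passes is not bounded here beyond the hypothetical `max_iter` guard.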

[1] J. Di Martino, "Dynamic time warping algorithms for isolated and connected word recognition," New Systems and Architectures for Automatic Speech Recognition and Synthesis, DeMori and Suen, eds., Springer Verlag, 1985.
[2] W. Dinkelbach, "On nonlinear fractional programming," Management Science, vol. 18, no. 7, pp. 492-498, Mar. 1967.
[3] P.S. Gopalakrishnan, D. Kanevsky, A. Nadas, and D. Nahamoo, "An Inequality for Rational Functions With Applications to Some Statistical Estimation Problems," IEEE Trans. Information Theory, vol. 37, no. 1, pp. 107-113, 1991.
[4] Y. Kitazume, E. Ohira, and T. Endo, "LSI implementation of a pattern matching algorithm for speech recognition," IEEE Proc. on Acoustics, Speech and Signal Processing, vol. 33, no. 1, pp. 1-5, 1985.
[5] A. Marzal and E. Vidal, "Computation of Normalized Edit Distance and Applications," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 15, pp. 926-932, 1993.
[6] D.S. Pallett, "Test procedures for the March 1987 DARPA benchmark tests," Proc. DARPA Speech Recognition Workshop, pp. 75-78, 1987.
[7] F. Prat, P. Aibar, A. Marzal, and E. Vidal, "El problema de la evaluación de un sistema de reconocimiento automático del habla mediante un único valor numérico," Tech. Report DSIC-II/15/94 (in Spanish), Departamento de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, 1994.
[8] L. Rabiner and S.E. Levinson, "Isolated and connected word recognition: Theory and applications," IEEE Trans. Comm., vol. 29, pp. 621-659, 1981.
[9] H. Rulot and E. Vidal, "Modeling (sub)string-length-based constraints through a grammatical inference method," Pattern Recognition: Theory and Applications, Devijver and Kittler, eds., Springer Verlag, pp. 451-459, 1987.
[10] H. Sakoe and S. Chiba, "Dynamic Programming Optimization for Spoken Word Recognition," IEEE Trans. ASSP, vol. 26, pp. 623-625, 1980.
[11] M. Sniedovich, Dynamic Programming, Marcel Dekker, 1992.
[12] R.A. Wagner and M.J. Fischer, "The String-to-String Correction Problem," J. ACM, vol. 21, no. 1, pp. 168-78, 1974.

Index Terms:
Normalized edit distance, Levenshtein distance, pattern recognition, string correction, editing, spelling correction, optical character recognition, speech recognition, fractional programming, fast algorithms.
Citation:
Enrique Vidal, Andrés Marzal, Pablo Aibar, "Fast Computation of Normalized Edit Distances," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, no. 9, pp. 899-902, Sept. 1995, doi:10.1109/34.406656