A Comparison of Standard Spell Checking Algorithms and a Novel Binary Neural Approach
September/October 2003 (vol. 15 no. 5)
pp. 1073-1081

Abstract: In this paper, we propose a simple, flexible, and efficient hybrid spell checking methodology based on phonetic matching, supervised learning, and associative matching in the AURA neural system. We integrate Hamming distance and n-gram algorithms, which have high recall for typing errors, with a phonetic spell checking algorithm in a single novel architecture. Our approach is suitable for any spell checking application, though it is aimed at isolated word error correction, particularly spell checking user queries in a search engine. We use a novel scoring scheme to integrate the words retrieved by each spelling approach and calculate an overall score for each matched word. From the overall scores, we rank the possible matches. We evaluate our approach against several benchmark spell checking algorithms for recall accuracy. Our proposed hybrid methodology has the highest recall rate of the techniques evaluated and a low computational cost.
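As a rough illustration of combining a positional (Hamming-style) match with an n-gram match and ranking candidates by an overall score, the sketch below may help. It is not the AURA implementation and not the paper's scoring scheme: the function names, the use of trigrams, the normalisation, and the choice to sum the two scores are all assumptions made for this example, and the phonetic component is omitted entirely.

```python
def hamming_score(query, word):
    """Count aligned positions where both strings hold the same letter."""
    return sum(1 for a, b in zip(query, word) if a == b)


def ngrams(word, n=3):
    """Return the set of contiguous n-letter substrings of a word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def ngram_score(query, word, n=3):
    """Count the n-grams shared by the query and the candidate word."""
    return len(ngrams(query, n) & ngrams(word, n))


def rank(query, lexicon, top_k=3):
    """Rank lexicon words by a combined score (assumed: sum of two normalised scores)."""
    query_grams = max(len(ngrams(query)), 1)  # avoid dividing by zero for short queries
    scored = []
    for word in lexicon:
        h = hamming_score(query, word) / max(len(query), len(word))
        g = ngram_score(query, word) / query_grams
        scored.append((h + g, word))
    # Highest score first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in scored[:top_k]]


if __name__ == "__main__":
    print(rank("spel", ["spell", "shell", "apple"]))
```

Here the misspelling "spel" matches "spell" in four aligned positions and shares both of its trigrams, so "spell" ranks first. A real system would add the phonetic match as a third scorer and tune how the scores are weighted.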

Index Terms:
Binary neural spell checker, integrated modular spell checker, associative matching.
Citation:
Victoria J. Hodge, Jim Austin, "A Comparison of Standard Spell Checking Algorithms and a Novel Binary Neural Approach," IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 5, pp. 1073-1081, Sept.-Oct. 2003, doi:10.1109/TKDE.2003.1232265