Issue No.02 - February (2012 vol.34)
pp: 211-224
V. Frinken, Inst. of Comput. Sci. & Appl. Math. (IAM), Univ. of Bern, Bern, Switzerland
A. Fischer, Inst. of Comput. Sci. & Appl. Math. (IAM), Univ. of Bern, Bern, Switzerland
R. Manmatha, Dept. of Comput. Sci., Univ. of Massachusetts, Amherst, MA, USA
H. Bunke, Inst. of Comput. Sci. & Appl. Math. (IAM), Univ. of Bern, Bern, Switzerland
ABSTRACT
Keyword spotting refers to the process of retrieving all instances of a given keyword from a document. In the present paper, a novel keyword spotting method for handwritten documents is described. It is derived from a neural network-based system for unconstrained handwriting recognition. As such, it performs template-free spotting, i.e., it is not necessary for a keyword to appear in the training set. The keyword spotting is done using a modification of the CTC Token Passing algorithm in conjunction with a recurrent neural network. We demonstrate that the proposed systems outperform not only a classical dynamic time warping-based approach but also a modern keyword spotting system based on hidden Markov models. Furthermore, we analyze the performance of the underlying neural networks when using them in a recognition task followed by keyword spotting on the produced transcription. We point out the advantages of keyword spotting when compared to classic text line recognition.
INDEX TERMS
recurrent neural nets, document image processing, handwriting recognition, hidden Markov models, text line recognition, novel word spotting method, recurrent neural networks, keyword spotting, handwritten documents, CTC token passing algorithm, artificial neural networks, feature extraction, indexes, image segmentation, neural networks, documentation, BLSTM, offline handwriting, document analysis, historical documents, neural network
CITATION
V. Frinken, A. Fischer, R. Manmatha, H. Bunke, "A Novel Word Spotting Method Based on Recurrent Neural Networks", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 34, no. 2, pp. 211-224, February 2012, doi:10.1109/TPAMI.2011.113