Issue No.11 - Nov. (2012 vol.34)
pp: 2108-2120
José A. Rodríguez-Serrano , Textual & Visual Pattern Anal. Group, Xerox Res. Centre Eur., Meylan, France
F. Perronnin , Textual & Visual Pattern Anal. Group, Xerox Res. Centre Eur., Meylan, France
ABSTRACT
This paper proposes a novel similarity measure between vector sequences. We work in the framework of model-based approaches, where each sequence is first mapped to a Hidden Markov Model (HMM) and a measure of similarity is then computed between the HMMs. We propose to model sequences with semicontinuous HMMs (SC-HMMs), a particular type of HMM whose emission probabilities in each state are mixtures of shared Gaussians. This crucial constraint provides two major benefits. First, the a priori information contained in the common set of Gaussians leads to a more accurate estimate of the HMM parameters. Second, the computation of a similarity between two SC-HMMs can be simplified to a Dynamic Time Warping (DTW) between their mixture weight vectors, which significantly reduces the computational cost. Experiments are carried out on a handwritten word retrieval task on three different datasets: an in-house dataset of real handwritten letters, the George Washington dataset, and the IFN/ENIT dataset of Arabic handwritten words. These experiments show that the proposed similarity outperforms both the traditional DTW between the original sequences and the model-based approach that uses ordinary continuous HMMs. We also show that this increase in accuracy can be traded against a significant reduction of the computational cost.
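The second benefit described above, comparing two SC-HMMs by running DTW over their per-state mixture weight vectors, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each model is already reduced to a list of weight vectors over the shared Gaussians (one vector per state, each summing to 1), and it uses one minus the Bhattacharyya coefficient as the local distance, which is an assumption made here for illustration. The names `weight_distance` and `dtw_distance` are hypothetical.

```python
import math


def weight_distance(p, q):
    """Local distance between two mixture-weight vectors.

    Assumed choice for this sketch: 1 - Bhattacharyya coefficient,
    which is 0 for identical distributions and at most 1.
    """
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return max(0.0, 1.0 - bc)  # clamp tiny negative values from rounding


def dtw_distance(seq_a, seq_b, local=weight_distance):
    """Classic DTW between two sequences of weight vectors.

    seq_a and seq_b hold one mixture-weight vector per HMM state.
    Returns the accumulated (unnormalized) cost of the best alignment.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning the first i states of seq_a
    # with the first j states of seq_b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = local(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # seq_a advances alone
                                 cost[i][j - 1],      # seq_b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# Toy models over 2 shared Gaussians (made-up weights, illustration only).
model_a = [[0.9, 0.1], [0.1, 0.9]]
model_b = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9]]  # same pattern, one extra state
model_c = [[0.1, 0.9], [0.9, 0.1]]              # reversed pattern

score_ab = dtw_distance(model_a, model_b)
score_ac = dtw_distance(model_a, model_c)
```

In this toy setup `score_ab` stays near zero because the warping absorbs the repeated state, while `score_ac` is clearly larger. The cost of each local comparison depends only on the number of shared Gaussians, not on the dimensionality of the original feature vectors, which is where the reduction in computation comes from under these assumptions.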
INDEX TERMS
Hidden Markov models, Vectors, Computational modeling, Visualization, Training, Feature extraction, Handwriting recognition, word spotting, image retrieval
CITATION
José A. Rodríguez-Serrano, F. Perronnin, "A Model-Based Sequence Similarity with Application to Handwritten Word Spotting", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.34, no. 11, pp. 2108-2120, Nov. 2012, doi:10.1109/TPAMI.2012.25