A Model-based Sequence Similarity with Application to Handwritten Word-spotting
PrePrint, ISSN: 0162-8828
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPAMI.2012.25
This article proposes a novel similarity measure between vector sequences. We work in the framework of model-based approaches, where each sequence is first mapped to a Hidden Markov Model (HMM) and then a probabilistic measure of similarity is computed between the HMMs. We propose to model sequences with semi-continuous HMMs (SC-HMMs), a particular type of HMM whose emission probabilities in each state are mixtures of shared Gaussians. This crucial constraint provides two major benefits. First, the a priori information contained in the common set of Gaussians leads to a more accurate estimate of the HMM parameters. Second, the computation of a probabilistic similarity between two SC-HMMs can be simplified to a Dynamic Time Warping (DTW) between their mixture weight vectors, which significantly reduces the computational cost. Experiments are carried out on a handwritten word retrieval task on three different datasets: an in-house dataset of real handwritten letters, the George Washington dataset, and the IFN/ENIT dataset of Arabic handwritten words. These experiments show that the proposed similarity outperforms both the traditional DTW between the original sequences and the model-based approach that uses ordinary continuous HMMs. We also show that this increase in accuracy can be traded against a significant reduction of the computational cost.
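To illustrate the second benefit, the following is a minimal sketch of a DTW between two sequences of mixture weight vectors, one vector per HMM state. It is not the authors' implementation. The abstract does not state which local measure is used between weight vectors, so the negative log Bhattacharyya coefficient used here is an assumption, as are the function names and the toy weights.

```python
import math


def bhattacharyya_cost(w1, w2):
    """Local cost between two mixture-weight vectors.

    Assumed choice for illustration: -log of the Bhattacharyya
    coefficient. Both vectors are expected to be non-negative and
    to sum to 1 over the shared set of Gaussians.
    """
    bc = sum(math.sqrt(a * b) for a, b in zip(w1, w2))
    # Clamp to avoid log(0) and tiny rounding excursions above 1.
    bc = min(max(bc, 1e-12), 1.0)
    return -math.log(bc)


def dtw_distance(seq_a, seq_b, local_cost=bhattacharyya_cost):
    """Classic DTW between two sequences of weight vectors.

    seq_a and seq_b are lists of per-state mixture-weight vectors.
    Returns the accumulated cost of the best monotonic alignment.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # acc[i][j] = best cost of aligning the first i states of seq_a
    # with the first j states of seq_b.
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = local_cost(seq_a[i - 1], seq_b[j - 1])
            acc[i][j] = cost + min(
                acc[i - 1][j],      # advance in seq_a only
                acc[i][j - 1],      # advance in seq_b only
                acc[i - 1][j - 1],  # advance in both
            )
    return acc[n][m]


# Toy example: two 2-state models over 3 shared Gaussians.
model_a = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
model_b = [[0.1, 0.1, 0.8], [0.8, 0.1, 0.1]]
print(dtw_distance(model_a, model_a))
print(dtw_distance(model_a, model_b))
```

Because each state is summarized by a short weight vector over the shared Gaussians, the cost table has one cell per pair of states rather than per pair of frames, which is where a saving over DTW on the original sequences would come from under these assumptions.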
Index Terms:
Handwriting analysis, Similarity measures
Citation:
Jose Rodriguez-Serrano, Florent Perronnin, "A Model-based Sequence Similarity with Application to Handwritten Word-spotting," IEEE Transactions on Pattern Analysis and Machine Intelligence, 12 Jan. 2012. IEEE Computer Society Digital Library. IEEE Computer Society, <http://doi.ieeecomputersociety.org/10.1109/TPAMI.2012.25>