Hidden Tree Markov Models for Document Image Classification
April 2003 (vol. 25 no. 4)
pp. 519-523

Abstract—Classification is an important problem in image document processing and is often a preliminary step toward recognition, understanding, and information extraction. In this paper, the problem is formulated in the framework of concept learning and each category corresponds to the set of image documents with similar physical structure. We propose a solution based on two algorithmic ideas. First, we obtain a structured representation of images based on labeled XY-trees (this representation informs the learner about important relationships between image subconstituents). Second, we propose a probabilistic architecture that extends hidden Markov models for learning probability distributions defined on spaces of labeled trees. Finally, a successful application of this method to the categorization of commercial invoices is presented.
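The second idea, a hidden Markov model defined over labeled trees, can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Node`, `HTMM`, `upward`, `likelihood`, and `classify` are hypothetical, labels are assumed to be discrete indices, and the parent-to-child transition is assumed to be the same for every child position, which the paper may define differently. Under those assumptions, each node carries a hidden state, the state of a child depends only on the state of its parent, and the likelihood of a whole tree follows from a single bottom-up pass. A document is then assigned to the category whose model gives its tree the highest likelihood.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    label: int                                   # observed discrete label of this tree node
    children: List["Node"] = field(default_factory=list)


@dataclass
class HTMM:
    prior: List[float]        # prior[i]    = P(root state = i)
    trans: List[List[float]]  # trans[i][j] = P(child state = j | parent state = i)
    emit: List[List[float]]   # emit[i][k]  = P(label = k | state = i)


def upward(model: HTMM, node: Node) -> List[float]:
    """Return beta, where beta[i] = P(labels in the subtree of node | node state = i)."""
    n = len(model.prior)
    beta = [model.emit[i][node.label] for i in range(n)]
    for child in node.children:
        child_beta = upward(model, child)
        for i in range(n):
            # Sum out the hidden state of the child given parent state i.
            beta[i] *= sum(model.trans[i][j] * child_beta[j] for j in range(n))
    return beta


def likelihood(model: HTMM, tree: Node) -> float:
    """P(tree | model), obtained by summing out the hidden state of the root."""
    beta = upward(model, tree)
    return sum(p * b for p, b in zip(model.prior, beta))


def classify(models: Dict[str, HTMM], tree: Node) -> str:
    """Pick the category whose model assigns the tree the highest likelihood."""
    return max(models, key=lambda name: likelihood(models[name], tree))


# Toy example: a root labeled 0 with two children labeled 0.
tree = Node(0, [Node(0), Node(0)])
model_a = HTMM(prior=[0.5, 0.5],
               trans=[[1.0, 0.0], [0.0, 1.0]],
               emit=[[0.9, 0.1], [0.1, 0.9]])
model_b = HTMM(prior=[1.0], trans=[[1.0]], emit=[[0.2, 0.8]])
best = classify({"a": model_a, "b": model_b}, tree)
```

For trees of realistic size the product of probabilities underflows, so a practical version would work in log space or rescale `beta` at each node. The parameters here are fixed by hand; learning them from data would typically use the EM algorithm.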

Index Terms:
Document classification, machine learning, Markovian models, structured information.
Citation:
Michelangelo Diligenti, Paolo Frasconi, Marco Gori, "Hidden Tree Markov Models for Document Image Classification," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 4, pp. 519-523, April 2003, doi:10.1109/TPAMI.2003.1190578