Issue No.04 - April (2003 vol.25)
pp: 519-523
Marco Gori, IEEE
<p><b>Abstract</b>—Classification is an important problem in image document processing and is often a preliminary step toward recognition, understanding, and information extraction. In this paper, the problem is formulated in the framework of concept learning and each category corresponds to the set of image documents with similar physical structure. We propose a solution based on two algorithmic ideas. First, we obtain a structured representation of images based on labeled XY-trees (this representation informs the learner about important relationships between image subconstituents). Second, we propose a probabilistic architecture that extends hidden Markov models for learning probability distributions defined on spaces of labeled trees. Finally, a successful application of this method to the categorization of commercial invoices is presented.</p>
Document classification, machine learning, Markovian models, structured information.
Michelangelo Diligenti, Paolo Frasconi, Marco Gori, "Hidden Tree Markov Models for Document Image Classification", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 25, no. 4, pp. 519-523, April 2003, doi:10.1109/TPAMI.2003.1190578
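
The second idea in the abstract, a probability distribution over labeled trees that extends hidden Markov models, can be illustrated with a minimal sketch. It assumes a generic top-down hidden tree Markov model: each node has a discrete hidden state that depends only on its parent's state, and each node's label is emitted from its own state. The names `Node`, `upward`, and `tree_likelihood`, the two-state parameters, and the labels `hcut`, `vcut`, and `text` are all hypothetical and chosen for illustration; the paper's exact parameterization may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """A node of a labeled tree (for example, one region of an XY-tree)."""
    label: str
    children: List["Node"] = field(default_factory=list)


def upward(node: Node, trans: List[List[float]],
           emit: List[Dict[str, float]]) -> List[float]:
    """Return beta, where beta[i] is the probability of all labels in the
    subtree rooted at `node`, given that the node's hidden state is i."""
    n_states = len(trans)
    # Start with the emission probability of this node's own label.
    beta = [emit[i][node.label] for i in range(n_states)]
    for child in node.children:
        child_beta = upward(child, trans, emit)
        for i in range(n_states):
            # Marginalize over the child's hidden state j.
            beta[i] *= sum(trans[i][j] * child_beta[j] for j in range(n_states))
    return beta


def tree_likelihood(root: Node, prior: List[float],
                    trans: List[List[float]],
                    emit: List[Dict[str, float]]) -> float:
    """Likelihood of the whole labeled tree under the assumed model."""
    beta = upward(root, trans, emit)
    return sum(p * b for p, b in zip(prior, beta))


if __name__ == "__main__":
    # Hypothetical parameters: 2 hidden states, 3 node labels.
    prior = [0.6, 0.4]                      # P(state of root)
    trans = [[0.7, 0.3], [0.2, 0.8]]        # P(child state | parent state)
    emit = [
        {"hcut": 0.5, "vcut": 0.3, "text": 0.2},
        {"hcut": 0.1, "vcut": 0.2, "text": 0.7},
    ]
    page = Node("hcut", [Node("text"), Node("vcut", [Node("text"), Node("text")])])
    print(tree_likelihood(page, prior, trans, emit))
```

When every node has exactly one child, the recursion reduces to the usual backward pass of a hidden Markov model over a sequence. One plausible way to use such a model for classification, again an assumption rather than a statement of the paper's procedure, is to fit one set of parameters per document category and assign a new tree to the category whose model gives it the highest likelihood.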