Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), Vol. 1
Indexing Historical Documents by Word Shape Signatures
Curitiba, Parana, Brazil, September 23-26, 2007
ISBN: 0-7695-2822-8
This paper presents a word spotting approach for indexing archival document images. Indices are constructed from keyword images, and the spotting strategy is formulated as indexing by shape. The well-known shape context descriptor is used to compute word image signatures from the skeleton points. Codewords are then extracted from thresholded shape contexts, which yields a simpler and more compact representation based on bit vectors. Document images are roughly segmented into words and a lookup table is constructed, with each word subimage taken as a bin. Keyword images are spotted in documents by a voting strategy: codewords index into the lookup table and cast votes for the corresponding bins. The approach is illustrated in a real application scenario with documents from a digital archive of the Spanish Civil War.
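The pipeline described in the abstract (shape contexts, thresholding into bit-vector codewords, a lookup table with one bin per word, and voting) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the log-polar bin counts (`n_r`, `n_theta`), the radius `r_max`, the threshold, and the toy point sets are all assumptions chosen for illustration, and skeleton extraction is assumed to have already produced a list of (x, y) points per word.

```python
import math
from collections import Counter, defaultdict


def shape_context(points, i, n_r=3, n_theta=8, r_max=50.0):
    """Log-polar histogram of the other points as seen from points[i]."""
    hist = [0] * (n_r * n_theta)
    x0, y0 = points[i]
    for j, (x, y) in enumerate(points):
        if j == i:
            continue
        r = math.hypot(x - x0, y - y0)
        if r == 0 or r > r_max:
            continue
        # Log-spaced radial bin, clamped to the last ring.
        r_bin = min(n_r - 1, int(n_r * math.log1p(r) / math.log1p(r_max)))
        # Angular bin over [0, 2*pi), clamped to the last sector.
        theta = math.atan2(y - y0, x - x0) % (2 * math.pi)
        t_bin = min(n_theta - 1, int(n_theta * theta / (2 * math.pi)))
        hist[r_bin * n_theta + t_bin] += 1
    return hist


def codeword(hist, threshold=1):
    """Threshold a histogram into a bit vector, packed into an int."""
    bits = 0
    for k, count in enumerate(hist):
        if count >= threshold:
            bits |= 1 << k
    return bits


def word_codewords(points):
    """Set of codewords, one per point of a word."""
    return {codeword(shape_context(points, i)) for i in range(len(points))}


def build_index(words):
    """Lookup table: codeword -> set of word ids (the voting bins)."""
    table = defaultdict(set)
    for word_id, points in words.items():
        for cw in word_codewords(points):
            table[cw].add(word_id)
    return table


def spot(keyword_points, table):
    """Each keyword codeword votes for every word bin that contains it."""
    votes = Counter()
    for cw in word_codewords(keyword_points):
        for word_id in table.get(cw, ()):
            votes[word_id] += 1
    return votes.most_common()


# Toy "skeleton" point sets (hypothetical data): a horizontal stroke
# and an inverted-L stroke.
words = {
    "w0": [(0, 0), (10, 0), (20, 0)],
    "w1": [(0, 0), (0, 10), (0, 20), (10, 20)],
}
table = build_index(words)
ranking = spot([(0, 0), (10, 0), (20, 0)], table)  # query matches w0
```

In this toy setup the query shares all of its codewords with `w0` and none with `w1`, so `w0` ranks first. Exact codeword matching is the simplest possible choice; a tolerant match (for example, a small Hamming distance between bit vectors) would be a natural variation, though the abstract does not say which is used.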
Citation:
J. Lladós, G. Sanchez, "Indexing Historical Documents by Word Shape Signatures," Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), vol. 1, pp. 362-366, 2007.