16th International Conference on Pattern Recognition (ICPR'02) - Volume 3
Trainable Table Location in Document Images
Quebec City, QC, Canada
August 11-15, 2002
ISBN: 0-7695-1695-X
F. Cesarini, Università di Firenze
S. Marinai, Università di Firenze
L. Sarti, Università di Siena
G. Soda, Università di Firenze

We describe an approach for table location in document images. The documents are described by means of a hierarchical representation based on the MXY tree. The presence of a table is hypothesized by searching for parallel lines in the MXY tree of the page. This hypothesis is then verified by locating perpendicular lines or white spaces in the region enclosed between the parallel lines. Lastly, located tables can be merged on the basis of proximity and similarity criteria.
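
The hypothesize, verify, and merge steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: it works on a flat list of already-extracted ruling lines rather than on an MXY tree, it checks only perpendicular lines (the white-space test is omitted), and every name and threshold (`HLine`, `VLine`, `Region`, `max_offset`, `min_cover`, `max_gap`) is a hypothetical choice made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HLine:  # horizontal ruling line (hypothetical type)
    y: int
    x0: int
    x1: int


@dataclass(frozen=True)
class VLine:  # vertical ruling line (hypothetical type)
    x: int
    y0: int
    y1: int


@dataclass(frozen=True)
class Region:  # axis-aligned candidate table region
    x0: int
    y0: int
    x1: int
    y1: int


def hypothesize(hlines, max_offset=10):
    """Pair vertically consecutive horizontal lines whose endpoints roughly align."""
    ordered = sorted(hlines, key=lambda l: l.y)
    regions = []
    for top, bottom in zip(ordered, ordered[1:]):
        aligned = (abs(top.x0 - bottom.x0) <= max_offset
                   and abs(top.x1 - bottom.x1) <= max_offset)
        if aligned:
            regions.append(Region(min(top.x0, bottom.x0), top.y,
                                  max(top.x1, bottom.x1), bottom.y))
    return regions


def verify(region, vlines, min_cover=0.8):
    """Accept a region if a vertical line inside it spans most of its height."""
    height = region.y1 - region.y0
    if height <= 0:
        return False
    for v in vlines:
        if region.x0 <= v.x <= region.x1:
            overlap = min(v.y1, region.y1) - max(v.y0, region.y0)
            if overlap >= min_cover * height:
                return True
    return False


def merge(regions, max_gap=5, max_offset=10):
    """Merge vertically adjacent regions that have a similar horizontal extent."""
    merged = []
    for r in sorted(regions, key=lambda r: r.y0):
        if merged:
            last = merged[-1]
            close = r.y0 - last.y1 <= max_gap
            similar = (abs(r.x0 - last.x0) <= max_offset
                       and abs(r.x1 - last.x1) <= max_offset)
            if close and similar:
                merged[-1] = Region(min(last.x0, r.x0), last.y0,
                                    max(last.x1, r.x1), max(last.y1, r.y1))
                continue
        merged.append(r)
    return merged


def locate_tables(hlines, vlines):
    """Run the three steps: hypothesize, verify, merge."""
    candidates = [r for r in hypothesize(hlines) if verify(r, vlines)]
    return merge(candidates)
```

In this sketch the thresholds are plain keyword arguments, which makes them easy to tune against a training set.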

An optimization method that relies on the definition of an appropriate table location index allows us to identify the optimal values of the thresholds involved in the algorithm. In this way, the algorithm can be adapted to recognize tables with different features by maximizing its performance on an appropriate training set.
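
The idea of choosing thresholds by maximizing an index on a training set can be sketched as follows. The abstract names neither the optimization method nor the definition of the table location index, so this example substitutes an exhaustive grid search and takes the index as a caller-supplied function. The names `tune_thresholds`, `locate`, `index`, and `grid` are assumptions made for illustration only.

```python
from itertools import product


def tune_thresholds(pages, truths, locate, index, grid):
    """Return the threshold setting that maximizes the mean index on a training set.

    pages  -- training pages, in whatever form `locate` accepts
    truths -- ground-truth tables, one entry per page
    locate -- callable(page, **thresholds) returning predicted tables
    index  -- callable(predicted, truth) returning a score (higher is better)
    grid   -- dict mapping each threshold name to its candidate values
    """
    names = sorted(grid)
    best_score, best_params = float("-inf"), None
    # Exhaustive search: evaluate every combination of candidate values.
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = [index(locate(page, **params), truth)
                  for page, truth in zip(pages, truths)]
        score = sum(scores) / len(scores)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score
```

Grid search is used here only because it is simple to read; its cost grows with the product of the candidate counts, so a real system with many thresholds would likely need a more economical search.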

The algorithm has been evaluated on two data sets containing more than 1500 pages, and its results have been compared with the tables identified by two commercial OCR packages.

Citation:
F. Cesarini, S. Marinai, L. Sarti, G. Soda, "Trainable Table Location in Document Images," icpr, vol. 3, pp. 30236, 16th International Conference on Pattern Recognition (ICPR'02) - Volume 3, 2002