3rd Euromicro Workshop on Parallel and Distributed Processing
A parallel algorithm for "document segmentation"
San Remo, Italy
January 25-27, 1995
ISBN: 0-8186-7031-2
We present a parallel algorithm for the physical segmentation of technical documents. The proposed method follows a "data parallel" approach based on a divide-and-conquer implementation. A document page is statically partitioned into n equal-sized rectangular blocks, where n is the number of processors. Each processor independently segments its assigned block according to the same rules: row/column or-ing and profile xor-ing. Each segmentation is stored in the form of an xy-tree. The computed trees are then combined, in pairs and in parallel, without re-examining the original image. In the paper we prove that the independently computed xy-trees can be combined efficiently, without using the original image, to form the global tree, that is, the tree obtained by a sequential application of the algorithm to the whole image. The method has been implemented on a LAN of workstations communicating through the PVM3 system.
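As an illustration of the per-block step described in the abstract, the following minimal Python sketch partitions a binary page image into rectangular blocks and builds an xy-tree for a region by OR-ing pixels along rows and columns and cutting at blank gaps. It is a generic recursive XY-cut under stated assumptions, not the authors' implementation: the names `XYNode`, `partition`, `or_profile`, `ink_runs`, and `xy_cut` are hypothetical, the profile xor-ing rule and the pairwise tree-merging procedure are not reproduced because the abstract does not specify them, and the blocks are processed sequentially here rather than on separate PVM3 processes.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


@dataclass
class XYNode:
    """One node of an xy-tree: a rectangular region and its sub-regions."""
    box: Box
    children: List["XYNode"] = field(default_factory=list)


def partition(height: int, width: int, grid_rows: int, grid_cols: int) -> List[Box]:
    """Split a page into grid_rows * grid_cols rectangular blocks (one per processor)."""
    return [
        (r * height // grid_rows, c * width // grid_cols,
         (r + 1) * height // grid_rows, (c + 1) * width // grid_cols)
        for r in range(grid_rows) for c in range(grid_cols)
    ]


def or_profile(img, box: Box, axis: str) -> List[int]:
    """OR together the pixels of each row (axis='row') or column (axis='col') of box."""
    top, left, bottom, right = box
    if axis == "row":
        return [int(any(img[r][left:right])) for r in range(top, bottom)]
    return [int(any(img[r][c] for r in range(top, bottom))) for c in range(left, right)]


def ink_runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) pairs of maximal runs of 1s; end is exclusive."""
    runs, start = [], None
    for i, value in enumerate(profile + [0]):  # trailing 0 closes a final run
        if value and start is None:
            start = i
        elif not value and start is not None:
            runs.append((start, i))
            start = None
    return runs


def xy_cut(img, box: Box, axis: str = "row") -> XYNode:
    """Recursively cut box at blank rows/columns, alternating the cut direction."""
    top, left, bottom, right = box
    node = XYNode(box)
    other = "col" if axis == "row" else "row"
    for cut_axis, next_axis in ((axis, other), (other, axis)):
        runs = ink_runs(or_profile(img, box, cut_axis))
        if len(runs) > 1:  # at least one blank gap separates two inked runs
            for start, end in runs:
                if cut_axis == "row":
                    child_box = (top + start, left, top + end, right)
                else:
                    child_box = (top, left + start, bottom, left + end)
                node.children.append(xy_cut(img, child_box, next_axis))
            return node
    return node  # no gap in either direction: this region is a leaf


# Tiny binary page (1 = ink, 0 = background): two side-by-side blobs above a full-width line.
page = [
    [1, 1, 1, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
]
root = xy_cut(page, (0, 0, 4, 8))            # sequential tree for the whole page
blocks = partition(4, 8, 1, 2)               # two blocks, as if n = 2 processors
block_trees = [xy_cut(page, b) for b in blocks]  # independent per-block trees
```

Each call to `xy_cut` on a block touches only the pixels of that block, which is what makes the per-block step independent across processors. Note that in this toy page the full-width line is cut in two by the block boundary, so a merge step would have to rejoin the two halves to recover the sequential tree.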
Index Terms:
parallel algorithms; image segmentation; document image processing; divide and conquer methods; tree data structures; parallel algorithm; document segmentation; data parallel approach; divide and conquer implementation; xy-tree; xy-trees; PVM3 system
Citation:
M. Ancona, M. De Benedetto, "A parallel algorithm for 'document segmentation'," pdp, pp. 516, 3rd Euromicro Workshop on Parallel and Distributed Processing, 1995