Issue No.04 - April (2009 vol.31)
pp: 763-764
Published by the IEEE Computer Society
Faisal Shafait, German Research Center for Artificial Intelligence, Kaiserslautern
Daniel Keysers, German Research Center for Artificial Intelligence, Kaiserslautern
Thomas M. Breuel, Technical University of Kaiserslautern, Kaiserslautern
ABSTRACT
In contrast to prior experimental work, our results support the conclusion that RXYC can perform well after marginal noise removal. However, marginal noise removal on page images like those found in UW3 remains a hard problem, and it therefore remains an open question whether RXYC can actually achieve competitive performance on such databases.
Nagy et al. [1] misrepresent our work [2], [3] when they write:
Like Mao and Kanungo, Shafait et al. suggest that the poor performance of the X-Y tree method is due to its vulnerability to noise.
We do not conclude that RXYC has "poor performance." In fact, our paper strongly argues against such a simplistic, one-dimensional view of performance evaluation. The stated purpose of our paper was to introduce a novel evaluation method based on a vectorial score and to demonstrate its utility and validity by comparing it to the results obtained using Mao and Kanungo's method [4]. Our analysis shows, among other things, that Mao and Kanungo's conclusion that RXYC is the worst of the algorithms needs to be modified, and our very first recommendation in [3] is (Section 4.3):
For clean documents with little or no skew, the x-y cut algorithm might be a good choice as it is fast and easy to implement.
In other words, if black borders and marginal noise have been successfully removed and if documents have been successfully deskewed, we tentatively recommend the use of RXYC. We can derive such a recommendation from our data precisely because the vectorial score lets us draw conclusions about the behavior of algorithms without testing all possible combinations of preprocessing methods and layout analysis methods. This is one illustration of the advantages of the vectorial score over a simple score and is the primary point of our paper.
Nagy et al. imply in their comment that the necessary document cleanup step is simple; for example, they write:
The page images tested in [1] and [2] were drawn from the University of Washington data set [5], which was evidently scanned against a black (or nonreflective) background. [...] A reasonable motivation for a nonreflective background is that detecting the edges of the paper greatly simplifies eliminating black pixels that do not belong to the page.
In fact, Nagy et al. are wrong regarding the origin and nature of the marginal noise in UW3. The source of the marginal noise in UW3 is documented [5]: UW3 contains pages from a wide variety of scanning conditions, including different page sizes and scans after manual photocopying. Contrary to what Nagy et al. state, UW3 does not contain "black borders" designed to be easy to remove; it contains unpredictable and variable marginal noise. This is also evident from looking at samples of UW3 page images (see Fig. 1).


Fig. 1. Sample images from the University of Washington Database 3 (UW3). The samples illustrate the variability and unpredictability of marginal noise in UW3. Across the entire database, noise outside the page margins of the scanned page consists of connected components at many different sizes and shapes, including actual text, and ranges from nearly absent to dominating the image. On other pages, illustrations outside the text area may resemble marginal noise and black borders. (a) A00IBIN. (b) A001BIN. (c) A031BIN. (d) D035BIN. (e) E001BIN.




In addition, a growing literature on marginal noise removal [6], [7], [8], [9], [10], [11], [12] suggests that marginal noise removal remains a difficult problem. We do not know of any algorithm (simple or otherwise) capable of reliably removing marginal noise components on UW3 page images to the degree required by RXYC. Therefore, although we considered it, testing combinations of RXYC and different marginal noise removal methods is not a "simple" experiment that we could have carried out as part of our evaluation. As it was not directly relevant to the actual conclusions of our paper, we decided to leave this for future work.
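To make concrete why RXYC depends on clean margins, the following is a minimal sketch of a recursive X-Y cut on a toy binary image. It is an illustration under stated assumptions, not the implementation evaluated in [3]: the function names, the `min_gap` threshold, and the synthetic 20 x 20 "pages" are all hypothetical, and the solid frame used as noise is far more regular than the marginal noise found in UW3.

```python
def zero_runs(profile):
    """Return (start, end) pairs of maximal runs of zeros in a 1-D profile."""
    runs, start = [], None
    for i, v in enumerate(profile):
        if v == 0 and start is None:
            start = i
        elif v != 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def xy_cut(img, top, bottom, left, right, min_gap=2):
    """Split region [top:bottom, left:right] of a binary image (list of rows,
    1 = black) at its widest empty valley, recursively. Returns a list of
    (top, bottom, left, right) boxes. min_gap is an assumed threshold."""
    rows = [sum(img[y][left:right]) for y in range(top, bottom)]
    cols = [sum(img[y][x] for y in range(top, bottom)) for x in range(left, right)]
    ys = [i for i, v in enumerate(rows) if v]
    xs = [i for i, v in enumerate(cols) if v]
    if not ys:
        return []  # region contains no black pixels
    # Shrink the region to the bounding box of its black pixels.
    t, b = top + ys[0], top + ys[-1] + 1
    l, r = left + xs[0], left + xs[-1] + 1
    rows, cols = rows[ys[0]:ys[-1] + 1], cols[xs[0]:xs[-1] + 1]
    # Collect empty valleys in both projection profiles; keep the widest.
    valleys = [(e - s, "h", s, e) for s, e in zero_runs(rows)]
    valleys += [(e - s, "v", s, e) for s, e in zero_runs(cols)]
    valleys = [v for v in valleys if v[0] >= min_gap]
    if not valleys:
        return [(t, b, l, r)]  # no admissible cut: region is one block
    _, axis, s, e = max(valleys)
    if axis == "h":  # cut along a horizontal white band
        return (xy_cut(img, t, t + s, l, r, min_gap)
                + xy_cut(img, t + e, b, l, r, min_gap))
    return (xy_cut(img, t, b, l, l + s, min_gap)
            + xy_cut(img, t, b, l + e, r, min_gap))


def make_page(with_frame):
    """Toy page: two solid 'text blocks', optionally enclosed by a black frame."""
    img = [[0] * 20 for _ in range(20)]
    for y in list(range(3, 8)) + list(range(11, 17)):
        for x in range(4, 16):
            img[y][x] = 1
    if with_frame:
        for i in range(20):
            img[0][i] = img[19][i] = img[i][0] = img[i][19] = 1
    return img


clean = make_page(with_frame=False)
noisy = make_page(with_frame=True)
print(xy_cut(clean, 0, 20, 0, 20))
print(xy_cut(noisy, 0, 20, 0, 20))
```

On the clean page the two blocks are separated at the white band between them. With the frame added, every row and every column of the bounding box contains black pixels, so neither projection profile has an empty valley and the whole page comes back as a single block. Real marginal noise is less regular than this frame, which is what makes removing it reliably so difficult.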
We believe that the source of the "persistent flaw" that Nagy et al. perceive in subsequent work may lie in their own publications: Neither [13] nor [14] discloses a limitation of RXYC to documents scanned against a white background or provides a border removal algorithm. Had they done so, subsequent work would have taken that into account.
Our paper makes the statement that our experimental results support: RXYC methods can perform well if marginal noise can be removed. This result represents a strong improvement over Mao and Kanungo's results, which simply stated that RXYC works poorly on UW3. Nagy et al.'s statement that border removal is "simple," however, is evidently false for UW3 and similar real-world databases; we note that, even in their letter, they fail to cite or state such an algorithm.
Determining whether RXYC can actually achieve competitive performance on document image collections like UW3 therefore remains a complex and open question that needs to be explored in future work.

    F. Shafait and D. Keysers are with the Image Understanding and Pattern Recognition (IUPR) Research Group, German Research Center for Artificial Intelligence (DFKI GmbH), D-67663 Kaiserslautern, Germany. E-mail: {faisal.shafait, daniel.keysers}@dfki.de.

    T.M. Breuel is with the Department of Computer Science, Technical University of Kaiserslautern, D-67663 Kaiserslautern, Germany. E-mail: tmb@informatik.uni-kl.de.

Manuscript received 25 July 2008; accepted 18 Aug. 2008; published online 27 Aug. 2008.

Recommended for acceptance by L. O'Gorman.

For information on obtaining reprints of this article, please send e-mail to: tpami@computer.org, and reference IEEECS Log Number TPAMI-2008-07-0444.

Digital Object Identifier no. 10.1109/TPAMI.2008.220.
