Proceedings of Sixth International Conference on Document Analysis and Recognition (2001)
Sept. 10, 2001 to Sept. 13, 2001
Léon Bottou, AT&T Labs - Research
Patrick Haffner, AT&T Labs - Research
Yann LeCun, AT&T Labs - Research
Abstract: How can we turn the description of a digital (i.e., electronically produced) document into something efficient for multilayer raster formats [1, 6, 4]? It is first shown that a foreground/background segmentation without overlapping foreground components can be more efficient for viewing or printing. Then, a new algorithm that prevents overlaps between foreground components while optimizing both the document quality and the compression ratio is derived from the Minimum Description Length (MDL) criterion. This algorithm makes the DjVu compression format significantly more efficient on electronically produced documents. Comparisons with other formats are provided.
L. Bottou, P. Haffner and Y. LeCun, "Efficient Conversion of Digital Documents to Multilayer Raster Formats," Proceedings of Sixth International Conference on Document Analysis and Recognition (ICDAR), Seattle, Washington, 2001, pp. 0444.