Seventh International Conference on Document Analysis and Recognition (ICDAR'03) - Volume 2
Text - Image Separation in Devanagari Documents
Edinburgh, Scotland
August 3-6, 2003
ISBN: 0-7695-1960-1
In this paper we present a top-down, projection-profile-based algorithm to separate text blocks from image blocks in a Devanagari document. We use a distinctive feature of Devanagari text, the Shirorekha (header line), to analyze the pattern that Devanagari text produces in the horizontal profile. The horizontal profile of a text block shows regularity in frequency and orientation, as well as spatial cohesion. The algorithm uses these features to identify text blocks in a document image containing both text and graphics.
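The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it covers only the "regularity in frequency" cue, and it assumes the block is already binarized (1 = ink) and deskewed. The function names, the thresholds `rel_height`, `max_spacing_cv` and `min_lines`, and the spacing test itself are hypothetical choices made for illustration.

```python
import numpy as np

def horizontal_profile(binary):
    """Count ink pixels in each row; `binary` is a 2-D array with 1 = ink."""
    return np.asarray(binary).sum(axis=1)

def header_line_rows(profile, rel_height=0.6):
    """Return rows that are local maxima above rel_height * max(profile).

    A Shirorekha spans most of a text line, so it should appear as a sharp
    peak in the horizontal profile. The threshold is an assumed value.
    """
    profile = np.asarray(profile, dtype=float)
    if profile.size == 0 or profile.max() == 0:
        return np.array([], dtype=int)
    thresh = rel_height * profile.max()
    rows = []
    for i, v in enumerate(profile):
        left = profile[i - 1] if i > 0 else -1.0
        right = profile[i + 1] if i < len(profile) - 1 else -1.0
        # Strict on the left, non-strict on the right: a flat-topped peak
        # (a header line several pixels thick) is reported only once.
        if v >= thresh and v > left and v >= right:
            rows.append(i)
    return np.array(rows, dtype=int)

def looks_like_text(binary, rel_height=0.6, max_spacing_cv=0.25, min_lines=3):
    """Heuristic: treat the block as text if its peaks are evenly spaced.

    Evenness is measured by the coefficient of variation of the gaps
    between consecutive peaks (an assumed stand-in for 'regularity').
    """
    peaks = header_line_rows(horizontal_profile(binary), rel_height)
    if len(peaks) < min_lines:
        return False
    gaps = np.diff(peaks)
    return bool(gaps.std() / gaps.mean() <= max_spacing_cv)
```

Under these assumptions, a block of evenly spaced text lines yields a periodic train of peaks and passes the test, while a filled picture region yields a single broad plateau and fails it. The orientation and spatial-cohesion cues named in the abstract would need additional checks that this sketch does not attempt.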
Citation:
Swapnil Khedekar, Vemulapati Ramanaprasad, Srirangaraj Setlur, Venugopal Govindaraju, "Text - Image Separation in Devanagari Documents," Seventh International Conference on Document Analysis and Recognition (ICDAR'03), vol. 2, p. 1265, 2003