Skew Angle Detection of Digitized Indian Script Documents
February 1997 (vol. 19 no. 2)
pp. 182-186

Abstract: Skew angle detection is considered for scanned documents containing the most popular Indian scripts (Devnagari and Bangla). Most characters in these scripts have a horizontal line at the top, called the head line. Within a word, the head lines of adjacent characters mostly join one another, so the word appears as a single connected component. The proposed method first labels the components. The upper envelope of each component is then found by columnwise scanning from an imaginary line above the component. Portions of the upper envelope that satisfy the properties of a digital straight line are detected and clustered so that each cluster belongs to a single text line. Estimates from the individual clusters are combined to obtain the skew angle. Apart from accuracy and efficiency, an advantage of the method is that character segmentation and zone detection can be done readily from the head line information, which is useful in Optical Character Recognition approaches for these scripts.
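
The upper-envelope step described in the abstract can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the input is assumed to be a single already-labeled component given as a binary grid (1 for ink, 0 for background), and an ordinary least-squares fit stands in for the digital-straight-line test and the clustering of segments into text lines.

```python
import math


def upper_envelope(component):
    """Return (col, row) of the topmost ink pixel in each column.

    `component` is a list of rows; 1 marks ink, 0 marks background.
    Each column is scanned downward from the top; empty columns are skipped.
    """
    height = len(component)
    width = len(component[0]) if height else 0
    envelope = []
    for col in range(width):
        for row in range(height):
            if component[row][col]:
                envelope.append((col, row))
                break
    return envelope


def skew_angle_degrees(envelope):
    """Estimate the skew angle from envelope points by least squares.

    Assumption: this fit replaces the digital-straight-line test of the
    abstract, so it is only reasonable when the envelope is dominated by
    the head line. Image rows grow downward, so the slope is negated to
    make a line that rises to the right give a positive angle.
    """
    n = len(envelope)
    if n < 2:
        raise ValueError("need at least two envelope points")
    mean_x = sum(x for x, _ in envelope) / n
    mean_y = sum(y for _, y in envelope) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in envelope)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in envelope)
    return math.degrees(math.atan2(-sxy, sxx))
```

Strokes that rise above the head line would pull a plain least-squares fit away from the true skew, which is presumably why the method keeps only the envelope portions that behave like digital straight lines before estimating the angle.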

Index Terms:
Document processing, skew detection, optical character recognition (OCR), document structure analysis, digital library.
B.B. Chaudhuri and U. Pal, "Skew Angle Detection of Digitized Indian Script Documents," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 2, pp. 182-186, Feb. 1997, doi:10.1109/34.574803