Issue No.04 - April (2008 vol.30)
pp: 591-605
Jian Liang, IEEE
ABSTRACT
Compared to typical scanners, handheld cameras offer convenient, flexible, portable, and non-contact image capture, which enables many new applications and breathes new life into existing ones. However, camera-captured documents may suffer from distortions caused by non-planar document shape and perspective projection, which lead to failure of current OCR technologies. We present a geometric rectification framework for restoring the frontal-flat view of a document from a single camera-captured image. Our approach estimates 3D document shape from texture flow information obtained directly from the image without requiring additional 3D/metric data or prior camera calibration. Our framework provides a unified solution for both planar and curved documents and can be applied in many, especially mobile, camera-based document analysis applications. Experiments show that our method produces results that are significantly more OCR compatible than the original images.
INDEX TERMS
Camera-based OCR, image rectification, shape estimation, texture flow analysis.
CITATION
Jian Liang, Daniel DeMenthon, David Doermann, "Geometric Rectification of Camera-Captured Document Images", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 30, no. 4, pp. 591-605, April 2008, doi:10.1109/TPAMI.2007.70724