
ASCII Text  
Richard Zanibbi, Dorothea Blostein, James R. Cordy, "Recognizing Mathematical Expressions Using Tree Transformation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 11, pp. 1455-1467, November 2002.  
BibTeX  
@article{10.1109/TPAMI.2002.1046157,
  author = {Richard Zanibbi and Dorothea Blostein and James R. Cordy},
  title = {Recognizing Mathematical Expressions Using Tree Transformation},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume = {24},
  number = {11},
  issn = {0162-8828},
  year = {2002},
  pages = {1455-1467},
  doi = {http://doi.ieeecomputersociety.org/10.1109/TPAMI.2002.1046157},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}  
RefWorks Procite/RefMan/Endnote  
TY  - JOUR
JO  - IEEE Transactions on Pattern Analysis and Machine Intelligence
TI  - Recognizing Mathematical Expressions Using Tree Transformation
IS  - 11
SN  - 0162-8828
SP  - 1455
EP  - 1467
EPD - 1455-1467
A1  - Richard Zanibbi
A1  - Dorothea Blostein
A1  - James R. Cordy
PY  - 2002
KW  - Document image analysis
KW  - recognition of mathematical notation
KW  - diagram recognition
KW  - tree transformation
KW  - graphics recognition
VL  - 24
JA  - IEEE Transactions on Pattern Analysis and Machine Intelligence
ER  -
Abstract: We describe a robust and efficient system for recognizing typeset and handwritten mathematical notation. From a list of symbols with bounding boxes, the system analyzes an expression in three successive passes. The Layout Pass constructs a Baseline Structure Tree (BST) describing the two-dimensional arrangement of input symbols. Reading order and operator dominance are used to allow efficient recognition of symbol layout even when symbols deviate greatly from their ideal positions. Next, the Lexical Pass produces a Lexed BST from the initial BST by grouping tokens composed of multiple input symbols; these include decimal numbers, function names, and symbols composed of nonoverlapping primitives such as “=”. The Lexical Pass also labels vertical structures such as fractions and accents. The Lexed BST is translated into