Research in automatic part-of-speech (POS) tagging has been dominated by Markov Model (MM) taggers. Eric Brill has recently described a transformation-based system that achieves comparable accuracy with simpler algorithms and a simpler representation than MM taggers. We present a set-based formal model of natural language ambiguity and semantic tagging that forms a basis for generalizing transformation-based learning (TBL) and Brill's TBL tagger. We discuss empirical observations of the training algorithm which suggest that a new evolutionary transformation learning strategy may dramatically reduce learning time without loss of accuracy.
Index Terms:
NLP, Part of Speech Tagging, Machine Learning
Citation:
James R. Curran, Raymond K. Wong, "Formalization of Transformation-Based Learning," Australasian Computer Science Conference (ACSC), p. 51, 2000