Web Intelligence and Intelligent Agent Technology, IEEE/WIC/ACM International Conference on (2011)
Aug. 22, 2011 to Aug. 27, 2011
Cross-lingual projection faces two major challenges: noise from word-alignment errors and syntactic divergences between the two languages. To address both, a semi-supervised learning framework for cross-lingual projection is proposed that obtains better annotations from parallel data. A projection model is introduced to model how labels are projected from the resource-rich language to the resource-scarce language. The projection model and the traditional target model of cross-lingual projection can be seen as two views of the parallel data. Using these two views, an extension of the co-training algorithm to structured prediction is designed to improve both models. Experiments show that the proposed method improves accuracy on the task of POS-tagging projection. Using only one-to-one alignments also proves more accurate than using all kinds of alignment information.
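To make the one-to-one alignment restriction concrete, the following is a minimal sketch of direct POS-tag projection over word alignments. It is an illustration only, not the authors' projection model or their co-training procedure. The function names, the tag labels, and the toy alignment are all assumptions introduced here, and alignments are assumed to be given as (source index, target index) pairs.

```python
from collections import Counter


def one_to_one_links(alignment):
    """Keep only links whose source and target positions each appear in exactly one link."""
    src_counts = Counter(s for s, _ in alignment)
    tgt_counts = Counter(t for _, t in alignment)
    return [(s, t) for s, t in alignment
            if src_counts[s] == 1 and tgt_counts[t] == 1]


def project_tags(source_tags, alignment, target_length, one_to_one_only=True):
    """Copy source-side POS tags onto target positions through alignment links.

    Target positions without a usable link stay None, so a downstream
    target-side tagger would have to fill them in.
    """
    links = one_to_one_links(alignment) if one_to_one_only else alignment
    projected = [None] * target_length
    for s, t in links:
        if projected[t] is None:  # first link wins when several reach one target word
            projected[t] = source_tags[s]
    return projected


# Hypothetical toy sentence pair: source word 2 is aligned to two target words.
source_tags = ["DET", "NOUN", "VERB"]
alignment = [(0, 0), (1, 1), (2, 2), (2, 3)]

strict = project_tags(source_tags, alignment, target_length=4)
loose = project_tags(source_tags, alignment, target_length=4, one_to_one_only=False)
```

In this toy case the strict variant labels only the two unambiguous target words and leaves the rest unlabeled, while the loose variant copies the same tag onto both words aligned to source word 2. That is the kind of trade-off between coverage and noise that the reported result on one-to-one alignments concerns.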
cross-lingual projection, semi-supervised learning, structured predictions, POS tagging, co-training
J. Li, C. Zhu, T. Zhao, P. Hu and M. Yu, "Semi-supervised Learning Framework for Cross-Lingual Projection," 2011 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (WI-IAT), Lyon, 2011, pp. 213-216.