2009 International Conference on Asian Language Processing
Automatic Acquisition of Large-Scale Academic Bilingual Parallel Corpus from the Web
Singapore
December 07-December 09
ISBN: 978-0-7695-3904-1
In this paper, we describe a system that automatically acquires a large-scale Chinese-English bilingual parallel corpus from the China Journals Full-text Database (CJFD), a component of the China National Knowledge Infrastructure (CNKI). The system extracts a large amount of parallel text, together with domain information, from the structured bilingual texts already present in CJFD, such as the Chinese and English abstracts and titles of academic articles. The acquired Chinese-English parallel corpus is several orders of magnitude larger than comparable corpora known to us. In addition, the system collects a large number of bilingual terms that can be applied directly to lexical acquisition.
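As a rough illustration of the idea in the abstract, the sketch below pairs the Chinese and English title and abstract of each article record and tags every pair with the article's domain. It is not the authors' system: the `ArticleRecord` class, its field names, and the `collect_pairs` function are hypothetical, since the abstract does not describe the CJFD record layout or the extraction procedure.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class ArticleRecord:
    # Hypothetical fields; the real CJFD schema is not given in the abstract.
    domain: str
    title_zh: str
    title_en: str
    abstract_zh: str
    abstract_en: str


def collect_pairs(records: Iterable[ArticleRecord]) -> List[Tuple[str, str, str]]:
    """Return (domain, chinese_text, english_text) triples.

    A pair is kept only when both sides are non-empty after trimming,
    so records with a missing translation contribute nothing for that field.
    """
    pairs = []
    for rec in records:
        candidates = (
            (rec.title_zh, rec.title_en),
            (rec.abstract_zh, rec.abstract_en),
        )
        for zh, en in candidates:
            zh, en = zh.strip(), en.strip()
            if zh and en:
                pairs.append((rec.domain, zh, en))
    return pairs
```

Keeping the domain label on every pair mirrors the abstract's point that the parallel text comes with domain information; sentence-level alignment inside the abstracts would be a separate step that this sketch does not attempt.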
Index Terms:
data mining, bilingual parallel corpora acquisition, bilingual term acquisition
Citation:
Han Yong, Li Yu, He Xiaoning, Yang Muyun, Lei Guohua, "Automatic Acquisition of Large-Scale Academic Bilingual Parallel Corpus from the Web," ialp, pp.318-321, 2009 International Conference on Asian Language Processing, 2009