2016 International Conference on Big Data and Smart Computing (BigComp) (2016)
Hong Kong, China
Jan. 18, 2016 to Jan. 20, 2016
ISSN: 2375-9356
ISBN: 978-1-4673-8795-8
pp: 321-324
Sunghee Lee, Program of Computer and Communications Engineering, College of IT, Kangwon National University, Chuncheon-si, Gangwon-do, Korea
Yeongkil Song, Program of Computer and Communications Engineering, College of IT, Kangwon National University, Chuncheon-si, Gangwon-do, Korea
Maengsik Choi, Program of Computer and Communications Engineering, College of IT, Kangwon National University, Chuncheon-si, Gangwon-do, Korea
Harksoo Kim, Program of Computer and Communications Engineering, College of IT, Kangwon National University, Chuncheon-si, Gangwon-do, Korea
ABSTRACT
Named entity recognition (NER) is a preliminary step for information extraction and question answering. Most previous studies on NER have relied on supervised machine learning methods that need a large human-annotated training corpus. In this paper, we propose a semi-supervised NER model that minimizes the time-consuming, labor-intensive task of constructing the training corpus. The proposed model first generates a weakly labeled training corpus using a distant supervision method. It then improves NER accuracy by refining the weakly labeled corpus with a bagging-based active learning method. In the experiments, the proposed model outperformed the previous semi-supervised model, achieving an F1-measure of 0.764 after 15 rounds of bagging-based active learning.
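The two stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the dictionary format, the base learner, or the query criterion, so the function names (weak_label, train_token_tagger, committee_disagreement), the toy most-frequent-tag learner, and the disagreement score are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def weak_label(tokens, dictionary, max_len=3):
    """Distant supervision (assumed form): tag the longest dictionary
    match starting at each position with BIO labels."""
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            entity_type = dictionary.get(" ".join(tokens[i:i + n]))
            if entity_type:
                tags[i] = "B-" + entity_type
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + entity_type
                i += n
                break
        else:
            i += 1  # no dictionary entry starts here
    return tags


def train_token_tagger(sentences):
    """Toy base learner (an assumption, not the paper's model):
    remember the most frequent tag seen for each token."""
    counts = defaultdict(Counter)
    for tokens, tags in sentences:
        for tok, tag in zip(tokens, tags):
            counts[tok][tag] += 1
    return {tok: c.most_common(1)[0][0] for tok, c in counts.items()}


def committee_disagreement(corpus, tokens, n_models=5, seed=0):
    """Bagging: train one tagger per bootstrap sample of the corpus, then
    score each token by the share of models voting against the majority."""
    rng = random.Random(seed)
    models = [
        train_token_tagger([rng.choice(corpus) for _ in corpus])
        for _ in range(n_models)
    ]
    scores = []
    for tok in tokens:
        votes = Counter(m.get(tok, "O") for m in models)
        scores.append(1 - votes.most_common(1)[0][1] / n_models)
    return scores


# Hypothetical dictionary and sentences, for illustration only.
dictionary = {"Harksoo Kim": "PER", "Hong Kong": "LOC"}
sentences = [
    ["Harksoo", "Kim", "works", "in", "Hong", "Kong"],
    ["Hong", "Kong", "hosted", "the", "conference"],
]
corpus = [(s, weak_label(s, dictionary)) for s in sentences]
scores = committee_disagreement(corpus, ["Kim", "Hong", "Seoul"])
```

In an active learning loop of this kind, the tokens with the highest disagreement scores would be sent to a human annotator, the weak labels corrected, and the committee retrained; the abstract reports 15 such rounds but does not state how many items were queried per round.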
INDEX TERMS
Training, Dictionaries, Data models, Bagging, Biological system modeling, Refining, Data mining
CITATION

S. Lee, Y. Song, M. Choi and H. Kim, "Bagging-based active learning model for named entity recognition with distant supervision," 2016 International Conference on Big Data and Smart Computing (BigComp), Hong Kong, China, 2016, pp. 321-324.
doi:10.1109/BIGCOMP.2016.7425938