A Two-Phase Bio-NER System Based on Integrated Classifiers and Multiagent Strategy
July-Aug. 2013 (vol. 10 no. 4)
pp. 897-904
Lishuang Li, Coll. of Comput. Sci. & Technol., Dalian Univ. of Technol., Dalian, China
Wenting Fan, Coll. of Comput. Sci. & Technol., Dalian Univ. of Technol., Dalian, China
Degen Huang, Coll. of Comput. Sci. & Technol., Dalian Univ. of Technol., Dalian, China
Biomedical named entity recognition (Bio-NER) is a fundamental step in biomedical text mining. This paper presents a two-phase Bio-NER model targeting the JNLPBA task. Our two-phase method divides the task into two subtasks: named entity detection (NED) and named entity classification (NEC). In the first phase, the NED subtask is handled with a two-layer stacking method: named entities (NEs) are distinguished from non-named entities (NNEs) in the biomedical literature without identifying their types. To this end, six classifiers are built with four toolkits (CRF++, YamCha, maximum entropy, Mallet) using different training methods, and are then integrated through the two-layer stacking method. In the second phase, for the NEC subtask, a multiagent strategy is introduced to determine the correct entity type for the entities identified in the first phase. The experimental results show that the presented approach achieves an F-score of 76.06 percent, which outperforms most state-of-the-art systems.
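The two-phase flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the base taggers are assumed to have already produced B/I/O labels (in the paper these would come from the six classifiers), the second-layer combiner is a simple lookup table chosen for illustration, and `classify` is a hypothetical stand-in for the multiagent strategy, whose details the abstract does not give. All function names are assumptions made for this example.

```python
from collections import Counter, defaultdict


def train_meta(base_outputs, gold):
    """Layer two of stacking: for each tuple of base-tagger labels seen on
    held-out data, remember the most frequent gold label."""
    counts = defaultdict(Counter)
    for preds, label in zip(zip(*base_outputs), gold):
        counts[preds][label] += 1
    return {preds: c.most_common(1)[0][0] for preds, c in counts.items()}


def apply_meta(meta, base_outputs):
    """Combine base-tagger labels token by token; fall back to a majority
    vote when a label tuple was never seen during meta training."""
    combined = []
    for preds in zip(*base_outputs):
        if preds in meta:
            combined.append(meta[preds])
        else:
            combined.append(Counter(preds).most_common(1)[0][0])
    return combined


def extract_spans(tags):
    """Turn B/I/O tags into (start, end) token spans, end exclusive."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "B" or (tag == "I" and start is None):
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def two_phase(tokens, base_outputs, meta, classify):
    """Phase 1 detects untyped entity spans (NED); phase 2 assigns a type to
    each span (NEC) using the caller-supplied, hypothetical `classify`."""
    tags = apply_meta(meta, base_outputs)
    return [(" ".join(tokens[s:e]), classify(tokens[s:e]))
            for s, e in extract_spans(tags)]
```

Separating detection from classification keeps the first phase a three-label (B/I/O) problem, so the combiner only has to learn entity boundaries; the type decision is deferred to a component that sees the whole detected span.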
Index Terms:
text analysis, bioinformatics, classification, data mining, maximum entropy methods, medical computing, multi-agent systems, F-score, two-phase Bio-NER system, integrated classifier, multiagent strategy, biomedical named entity recognition system, biomedical text mining, two-phase Bio-NER model targeting, JNLPBA task, named entity detection subtask, NED subtask, named entity classification subtask, NEC subtask, two-layer stacking method, biomedical literature, CRF++ toolkit, YamCha toolkit, maximum entropy toolkit, Mallet toolkit, toolkit training method, correct entity type determination, Stacking, Biological system modeling, Training, Proteins, Hidden Markov models, RNA, Computational modeling, Named entity recognition and classification, multiagent
Citation:
Lishuang Li, Wenting Fan, Degen Huang, "A Two-Phase Bio-NER System Based on Integrated Classifiers and Multiagent Strategy," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 10, no. 4, pp. 897-904, July-Aug. 2013, doi:10.1109/TCBB.2013.106