2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery
Detection of Protein Subcellular Localization Based on a Full Syntactic Parser and Semantic Information
October 18-October 20
ISBN: 978-0-7695-3305-6
A protein’s subcellular localization is considered an essential part of the description of its associated biomolecular phenomena. As the volume of biomolecular reports has increased, there has been a great deal of research on text mining to detect protein subcellular localization information in documents. It has been argued that linguistic information, especially syntactic information, is useful for identifying the subcellular localizations of proteins of interest. However, previous systems for detecting protein subcellular localization information used only shallow syntactic parsers and showed poor performance. Thus, there remains a need to use a full syntactic parser and to apply deep linguistic knowledge to the analysis of text for protein subcellular localization information. In addition, we use semantic information from the WordNet thesaurus. To improve performance in detecting protein subcellular localization information, this paper proposes a three-step method based on a full syntactic dependency parser and semantic information. In the first step, we construct syntactic dependency paths from each protein to its location candidate. In the second step, we retrieve the root information of the syntactic dependency paths. In the final step, we extract syn-semantic patterns of protein subtrees and location subtrees. From the root and subtree nodes, we extract syntactic category and syntactic direction as syntactic information, and the synset offset of the WordNet thesaurus as semantic information. According to the root information and the syn-semantic patterns of the subtrees, we extract (protein, localization) pairs. Even with no biomolecular knowledge, our method shows reasonable performance in experiments using Medline abstract data. Our proposed method gave an F-measure of 74.53% for training data and 58.90% for test data, outperforming previous methods by 12–25%.
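The first two steps of the method can be illustrated with a minimal sketch. It is not the author's implementation: the toy sentence, its parse, the relation labels, and the names `PARSE`, `chain_to_root`, and `dependency_path` are all hypothetical, and the sketch assumes that the "root" of a dependency path is the lowest common ancestor of the protein token and the location candidate in the dependency tree. The third step (syn-semantic patterns with WordNet synset offsets) is not covered here.

```python
from typing import Dict, List, Tuple

# Toy dependency parse (invented for illustration):
# token index -> (word, head index, relation); head -1 marks the sentence root.
PARSE: Dict[int, Tuple[str, int, str]] = {
    0: ("ProtA", 1, "nsubj"),
    1: ("localizes", -1, "root"),
    2: ("to", 4, "case"),
    3: ("the", 4, "det"),
    4: ("nucleus", 1, "obl"),
}


def chain_to_root(parse: Dict[int, Tuple[str, int, str]], index: int) -> List[int]:
    """Return the token indices from `index` up to the sentence root, inclusive."""
    chain = [index]
    while parse[chain[-1]][1] != -1:
        chain.append(parse[chain[-1]][1])
    return chain


def dependency_path(
    parse: Dict[int, Tuple[str, int, str]], protein: int, location: int
) -> Tuple[int, List[int], List[int]]:
    """Return (path root, protein-side nodes, location-side nodes).

    Assumption: the path root is the lowest common ancestor of the two tokens.
    Each side lists the nodes strictly below that root, starting at the token.
    """
    up = chain_to_root(parse, protein)
    down = chain_to_root(parse, location)
    ancestors = set(down)
    # The first node on the protein's chain that also dominates the location.
    root = next(i for i in up if i in ancestors)
    protein_side = up[: up.index(root)]
    location_side = down[: down.index(root)]
    return root, protein_side, location_side


root, protein_side, location_side = dependency_path(PARSE, protein=0, location=4)
print(PARSE[root][0], protein_side, location_side)
```

In this toy parse the path root is the verb that governs both the protein and the location candidate; a pattern-based extractor in the spirit of the abstract would then inspect that root and the two subtrees before accepting the (protein, localization) pair.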
Citation:
Mi-Young Kim, "Detection of Protein Subcellular Localization Based on a Full Syntactic Parser and Semantic Information," 2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), vol. 4, pp. 407-411, 2008