IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 10, no. 5, Sept.-Oct. 2013
Said Bleik, Inf. Syst. Dept., Univ. Heights, Newark, NJ, USA
Meenakshi Mishra, Dept. of Electr. Eng. & Comput. Sci., Univ. of Kansas, Lawrence, KS, USA
Jun Huan, Dept. of Electr. Eng. & Comput. Sci., Univ. of Kansas, Lawrence, KS, USA
Min Song, Dept. of Libr. & Inf. Sci., Yonsei Univ., Seoul, South Korea
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TCBB.2013.16
Recently, graph representations of text have shown improved performance over conventional bag-of-words representations in text categorization applications. In this paper, we present a graph-based representation for biomedical articles and use graph kernels to classify those articles into high-level categories. In our representation, common biomedical concepts and semantic relationships are identified with the help of an existing ontology and are used to build a rich graph structure that provides a consistent feature set and preserves additional semantic information that could improve a classifier's performance. We attempt to classify the graphs using both a set-based graph kernel, which can handle the disconnected nature of the graphs, and a simple linear kernel. Finally, we report results comparing the classification performance of the kernel classifiers with that of common text-based classifiers.
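The two kernel styles named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy documents, the triple format (concept, relation, concept), and the base kernel used inside set_kernel are all assumptions chosen for illustration. The linear kernel counts edges shared by two document graphs; the set-based kernel splits each graph into connected components and sums a base kernel over all component pairs, so disconnected graphs need no special handling.

```python
from itertools import product

# Hypothetical toy documents: each is a set of (concept, relation, concept)
# edges, as an ontology-based concept extractor might produce.
doc_a = {("aspirin", "treats", "headache"), ("aspirin", "isa", "nsaid")}
doc_b = {("aspirin", "treats", "headache"), ("ibuprofen", "isa", "nsaid")}


def linear_kernel(edges_x, edges_y):
    """Dot product of binary edge-indicator vectors: the number of shared edges."""
    return len(edges_x & edges_y)


def components(edges):
    """Split an edge set into connected components (edges treated as undirected)."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for head, _relation, tail in edges:
        parent[find(head)] = find(tail)

    groups = {}
    for edge in edges:
        groups.setdefault(find(edge[0]), set()).add(edge)
    return [frozenset(group) for group in groups.values()]


def nodes_of(component):
    """Collect the concepts that appear in a component's edges."""
    return {node for head, _relation, tail in component for node in (head, tail)}


def base_kernel(comp_x, comp_y):
    """Assumed base kernel: shared concepts times (1 + shared edges)."""
    shared_nodes = len(nodes_of(comp_x) & nodes_of(comp_y))
    return shared_nodes * (1 + len(comp_x & comp_y))


def set_kernel(edges_x, edges_y):
    """Sum the base kernel over every pair of connected components."""
    return sum(
        base_kernel(cx, cy)
        for cx, cy in product(components(edges_x), components(edges_y))
    )


if __name__ == "__main__":
    print("linear:", linear_kernel(doc_a, doc_b))
    print("set-based:", set_kernel(doc_a, doc_b))
```

Either function could fill a precomputed Gram matrix for a support vector machine. In this toy case the set-based score also rewards the shared concept "nsaid" even though the edges that mention it differ, which the edge-only linear kernel ignores.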
Kernel, Text categorization, Unified modeling language, Support vector machine classification, Graph representations, Semantics, textual and multimedia data, graph kernels, biomedical ontologies, mining methods and algorithms, text mining, classifier design and evaluation, modeling structured
Said Bleik, Meenakshi Mishra, Jun Huan, Min Song, "Text Categorization of Biomedical Data Sets Using Graph Kernels and a Controlled Vocabulary", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 10, no. 5, pp. 1211-1217, Sept.-Oct. 2013, doi:10.1109/TCBB.2013.16