Web Intelligence, IEEE / WIC / ACM International Conference on (2007)
Silicon Valley, California, USA
Nov. 2, 2007 to Nov. 5, 2007
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/WI.2007.55
This paper proposes a method for learning ontologies from a corpus of text documents. The method identifies concepts in the documents and organizes them into a subsumption hierarchy, without presupposing the existence of a seed ontology. It uncovers the latent topics from which the document text is assumed to be generated, and these topics form the concepts of the new ontology. This is done in a language-neutral way, using probabilistic space reduction techniques over the original term space of the corpus. Given the multiple sets of concepts (latent topics) discovered, the method constructs a subsumption hierarchy by performing conditional independence tests among pairs of latent topics, given a third one. The paper provides experimental results over the GENIA corpus from the domain of biomedicine.
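The conditional independence test mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's actual procedure: it assumes each latent topic has been reduced to a discrete per-document indicator (for example, 1 if the topic is present in a document and 0 otherwise), and it swaps in an empirical conditional mutual information estimate with a fixed cutoff as the independence criterion. The function names and the `threshold` value are hypothetical.

```python
import math
from collections import Counter


def conditional_mutual_information(a, b, c):
    """Estimate I(A; B | C) in nats from three equal-length sequences.

    Each sequence holds one discrete value per document, e.g. a
    hypothetical 0/1 flag saying whether a latent topic is present.
    """
    n = len(a)
    if n == 0 or len(b) != n or len(c) != n:
        raise ValueError("a, b and c must be non-empty and of equal length")

    # Joint and marginal counts needed for the plug-in estimate.
    n_abc = Counter(zip(a, b, c))
    n_ac = Counter(zip(a, c))
    n_bc = Counter(zip(b, c))
    n_c = Counter(c)

    cmi = 0.0
    for (x, y, z), count in n_abc.items():
        # p(x, y | z) / (p(x | z) * p(y | z)) written in terms of counts.
        ratio = count * n_c[z] / (n_ac[(x, z)] * n_bc[(y, z)])
        cmi += (count / n) * math.log(ratio)
    return cmi


def is_conditionally_independent(a, b, c, threshold=0.01):
    """Treat A and B as independent given C when the estimate is small.

    The cutoff is an assumption made for illustration only; a
    statistical test would be needed for a principled decision.
    """
    return conditional_mutual_information(a, b, c) < threshold
```

Under one plausible reading of the abstract, a topic C that renders two other topics A and B conditionally independent is a candidate parent of both in the subsumption hierarchy; the sketch only covers the independence check itself, not the construction of the hierarchy.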
G. Paliouras, E. Zavitsanos, G. A. Vouros and S. Petridis, "Discovering Subsumption Hierarchies of Ontology Concepts from Text Corpora," 2007 IEEE/WIC/ACM International Conference on Web Intelligence (WI), Fremont, CA, 2007, pp. 402-408.