IEEE Computer Society Bioinformatics Conference (CSB'03)
Text Pattern Visualization for analysis of biology full text and captions
Stanford, California
August 11-August 14
ISBN: 0-7695-2000-6
Andrea Elaina Grimes, Northeastern University
Robert P. Futrelle, Northeastern University
Large textbanks comprising thousands of full-text biology papers are rapidly becoming available. We describe an approach to characterizing all major language patterns in biology text in terms of Frameworks. Frameworks are "containers" made up of common phrases surrounding specific informational items such as gene and protein names. A Framework Viewer has been developed that shows similar text Frameworks aligned on the screen, much as biosequence visualization tools do. The Viewer makes it evident that Frameworks can find the types of structures needed to develop useful information retrieval systems. As a simple example, a single Framework selected 45,000 nouns from a corpus of 5 million words without error. This work points the way to highly automated systems that will be able to extract and index information in biology textbanks. Work in progress includes extensions to characterize recursive structures in text, subsystems to retrieve figures in papers, and the discovery of semantic relations to aid concept-based retrieval.
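The abstract does not say how a Framework is represented or matched, so the following is only a minimal sketch of the general idea under stated assumptions: a Framework is modeled as a fixed phrase with one open slot, compiled to a regular expression, and matches are printed with the slot in a fixed column, loosely imitating an aligned view. The `<SLOT>` notation, the function names, and the sample sentences are all hypothetical and are not taken from the paper.

```python
import re

def framework_to_regex(framework):
    """Compile a hypothetical Framework such as 'the expression of <SLOT> in'
    into a regex whose single capture group is the slot filler."""
    left, right = framework.split("<SLOT>")
    pattern = (r"\b" + re.escape(left.strip()) + r"\s+(\S+)\s+"
               + re.escape(right.strip()) + r"\b")
    return re.compile(pattern, re.IGNORECASE)

def slot_fillers(pattern, sentences):
    """Return the text found in the slot for every match, in order."""
    return [m.group(1) for s in sentences for m in pattern.finditer(s)]

def aligned_view(pattern, sentences, width=30):
    """Format each match so the slot filler starts in the same column:
    left context is right-justified to `width` characters."""
    rows = []
    for s in sentences:
        for m in pattern.finditer(s):
            left = s[:m.start(1)][-width:]
            right = s[m.end(1):][:width]
            rows.append(f"{left:>{width}}[{m.group(1)}]{right}")
    return rows

# Invented example sentences, for illustration only.
sentences = [
    "We measured the expression of BRCA1 in tumor cells.",
    "The expression of p53 in these tissues was reduced.",
    "No change in the level of actin was seen.",
]
framework = framework_to_regex("the expression of <SLOT> in")
fillers = slot_fillers(framework, sentences)
rows = aligned_view(framework, sentences)
for row in rows:
    print(row)
```

Here the fixed phrase acts as the "container" and the captured token is the informational item. A real system would need tokenization, multi-word slots, and patterns derived from the corpus rather than written by hand.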
Citation:
Andrea Elaina Grimes, Robert P. Futrelle, "Text Pattern Visualization for analysis of biology full text and captions," IEEE Computer Society Bioinformatics Conference (CSB'03), p. 648, 2003