Gelsius: A Literature-Based Workflow for Determining Quantitative Associations between Genes and Biological Processes
May-June 2013 (vol. 10 no. 3)
pp. 619-631
Francesco Abate, Polytech. of Turin, Turin, Italy
Andrea Acquaviva, Polytech. of Turin, Turin, Italy
Elisa Ficarra, Polytech. of Turin, Turin, Italy
Roberto Piva, Univ. of Turin, Turin, Italy
Enrico Macii, Polytech. of Turin, Turin, Italy
An effective methodology for extracting and quantifying knowledge from the biomedical literature would allow researchers to organize and analyze the results of high-throughput experiments based on microarrays and next-generation sequencing technologies. Despite the large amount of raw information available on the web, a tool able to extract a measure of the correlation between a list of genes and biological processes is not yet available. In this paper, we present Gelsius, a workflow that uses the biomedical literature to quantify the correlation between genes and terms describing biological processes. To this end, we built separate modules for query expansion and document canonicalization. These modules improve the measurement of correlation, which is performed using a latent semantic analysis approach. To the best of our knowledge, this is the first complete tool able to extract a measure of the correlation between genes and biological processes from the literature. We demonstrate the effectiveness of the proposed workflow on six biological processes and a set of genes, showing that the correlation results for known relationships agree with the definitions of gene functions provided by the NCI Thesaurus. In addition, the tool can propose new candidate relationships for later experimental validation. The tool is available at
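As an illustration of the latent semantic analysis step mentioned in the abstract, the following is a minimal sketch and not the Gelsius implementation. It assumes a tiny, made-up corpus in which gene and process names are single tokens; the token names (`gene_a`, `gene_b`), the raw-count weighting, the choice of `k=2`, and the helper functions are all hypothetical. The sketch builds a term-document matrix, truncates its SVD, and scores a gene-term pair by cosine similarity in the latent space.

```python
import numpy as np

# Toy, made-up "abstracts"; gene tokens are placeholders, not real data.
DOCS = [
    "gene_a apoptosis",
    "gene_a apoptosis cell",
    "gene_a apoptosis",
    "gene_b angiogenesis",
    "gene_b angiogenesis vessel",
]


def term_document_matrix(docs):
    """Return (sorted vocabulary, terms x documents raw count matrix)."""
    vocab = sorted({tok for doc in docs for tok in doc.split()})
    index = {tok: i for i, tok in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for tok in doc.split():
            counts[index[tok], j] += 1.0
    return vocab, counts


def lsa_term_vectors(counts, k):
    """Project terms into a k-dimensional latent space via truncated SVD."""
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    # Keep the k largest singular directions and scale by singular values.
    return u[:, :k] * s[:k]


def association(vocab, vectors, gene, term):
    """Cosine similarity between two term vectors in the latent space."""
    a = vectors[vocab.index(gene)]
    b = vectors[vocab.index(term)]
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


vocab, counts = term_document_matrix(DOCS)
vectors = lsa_term_vectors(counts, k=2)
score_related = association(vocab, vectors, "gene_a", "apoptosis")
score_unrelated = association(vocab, vectors, "gene_a", "angiogenesis")
print(score_related, score_unrelated)
```

In this toy corpus the pair that co-occurs in documents scores higher than the pair that never does. A real pipeline would presumably operate on retrieved and canonicalized abstracts with a weighting scheme such as TF-IDF, which this sketch leaves out.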
Index Terms:
query processing, document handling, genetics, knowledge acquisition, medical computing, Web, Gelsius, literature-based workflow, biological process, knowledge extraction, biomedical literature, microarrays, next-generation sequencing technology, query expansion, document canonicalization, latent semantic analysis approach, gene functions, NCI thesaurus, Correlation, Biological processes, Semantics, Unified modeling language, Abstracts, Large scale integration, Biomedical measurements, text mining, UMLS, gene ontology, thesaurus, ontologies
Francesco Abate, Andrea Acquaviva, Elisa Ficarra, Roberto Piva, Enrico Macii, "Gelsius: A Literature-Based Workflow for Determining Quantitative Associations between Genes and Biological Processes," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 10, no. 3, pp. 619-631, May-June 2013, doi:10.1109/TCBB.2013.11