Issue No. 03 - May-June (2013 vol. 10)
ISSN: 1545-5963
pp: 645-656
Gordon Gremme , Center for Bioinf., Univ. of Hamburg, Hamburg, Germany
Sascha Steinbiss , Center for Bioinf., Univ. of Hamburg, Hamburg, Germany
Stefan Kurtz , Center for Bioinf., Univ. of Hamburg, Hamburg, Germany
Genome annotations are often published as plain text files describing genomic features and their subcomponents by an implicit annotation graph. In this paper, we present the GenomeTools, a convenient and efficient software library and associated software tools for developing bioinformatics software intended to create, process or convert annotation graphs. The GenomeTools strictly follow the annotation graph approach, offering a unified graph-based representation. This gives the developer intuitive and immediate access to genomic features and tools for their manipulation. To process large annotation sets with low memory overhead, we have designed and implemented an efficient pull-based approach for sequential processing of annotations. This makes it possible to handle even the largest annotation sets, such as a complete catalogue of human variations. Our object-oriented C-based software library enables developers to conveniently implement their own functionality on annotation graphs and to integrate it into larger workflows, simultaneously accessing compressed sequence data if required. The careful C implementation of the GenomeTools not only ensures a light-weight memory footprint while allowing full sequential as well as random access to the annotation graph, but also facilitates the creation of bindings to a variety of scripting languages (such as Python and Ruby), all sharing the same interface.
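The pull-based sequential processing mentioned in the abstract can be illustrated with a small sketch. It does not use the GenomeTools API; the names Feature, gff3_in_stream and type_filter_stream are hypothetical, the sample data is invented for illustration, and the sketch streams individual GFF3-style lines rather than connected parts of an annotation graph. It only aims to show the idea that each stage requests the next item from its predecessor on demand, so the full annotation set never has to be held in memory.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class Feature(NamedTuple):
    """Hypothetical, minimal stand-in for one annotated genomic feature."""
    seqid: str
    type: str
    start: int
    end: int


def gff3_in_stream(handle: TextIO) -> Iterator[Feature]:
    """Source stage: parse a feature line only when the consumer asks for one."""
    for line in handle:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, pragmas and comments
        cols = line.rstrip("\n").split("\t")
        yield Feature(cols[0], cols[2], int(cols[3]), int(cols[4]))


def type_filter_stream(source: Iterator[Feature], wanted: str) -> Iterator[Feature]:
    """Intermediate stage: pull from the predecessor, pass on matching features."""
    for feature in source:
        if feature.type == wanted:
            yield feature


# Invented sample annotation in tab-separated, GFF3-like form.
SAMPLE = (
    "##gff-version 3\n"
    "chr1\t.\tgene\t1000\t9000\t.\t+\t.\tID=gene1\n"
    "chr1\t.\tmRNA\t1000\t9000\t.\t+\t.\tID=mrna1;Parent=gene1\n"
    "chr1\t.\texon\t1000\t1200\t.\t+\t.\tParent=mrna1\n"
    "chr1\t.\texon\t5000\t5500\t.\t+\t.\tParent=mrna1\n"
)

# The final consumer drives the chain: each sum() step pulls one exon,
# which in turn pulls lines from the source until the next match.
exons = type_filter_stream(gff3_in_stream(io.StringIO(SAMPLE)), "exon")
total_exon_length = sum(f.end - f.start + 1 for f in exons)
print(total_exon_length)
```

Because every stage is a generator, only one feature is in flight at a time, so memory use stays roughly constant regardless of how large the input is.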
Bioinformatics, Genomics, Software, Computer languages, Ontologies, Software libraries, reusable libraries, text analysis, authoring languages, bioinformatics, data compression, genomics, graphs, object-oriented languages, Ruby, GenomeTools, comprehensive software library, efficient structured genome annotation processing, plain text files, genomic features, implicit annotation graph, efficient software library, associated software tools, bioinformatics software, annotation graph conversion, annotation graph processing, annotation graph creation, annotation graph approach, unified graph-based representation, low memory overhead, efficient pull-based approach, sequential processing, catalogue, human variations, object-oriented C-based software library, compressed sequence data, careful C implementation, light-weight memory footprint, random access, script programming languages, Python, programming environments, Scientific computing, biology and genetics, software engineering
Gordon Gremme, Sascha Steinbiss, Stefan Kurtz, "GenomeTools: A Comprehensive Software Library for Efficient Processing of Structured Genome Annotations", IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 10, no. 3, pp. 645-656, May-June 2013, doi:10.1109/TCBB.2013.68