Carol Lushbough, Michael K. Bergman, Carolyn J. Lawrence, Doug Jennewein, Volker Brendel, "BioExtract Server—An Integrated Workflow-Enabling System to Access and Analyze Heterogeneous, Distributed Biomolecular Data," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 7, no. 1, pp. 12-24, January-March 2010, doi: 10.1109/TCBB.2008.98. IEEE Computer Society, Los Alamitos, CA, USA. ISSN 1545-5963.

Keywords: Bioinformatics (genome or protein) databases, data integration, distributed architectures, heterogeneous databases, mash-up, scientific workflow automation.

[1] S. Philippi, "Light-Weight Integration of Molecular Biological Databases," Bioinformatics, vol. 20, no. 1, pp. 51-57, 2004.
[2] L. Stein, "Integrating Biological Databases," Nature Rev. Genetics, vol. 4, no. 5, pp. 337-345, 2003.
[3] D.L. Wheeler, T. Barrett, D.A. Benson, S.H. Bryant, K. Canese, V. Chetvernin, D.M. Church, M. DiCuccio, R. Edgar, S. Federhen, L.Y. Geer, Y. Kapustin, O. Khovayko, D. Landsman, D.J. Lipman, T.L. Madden, D.R. Maglott, J. Ostell, V. Miller, K.D. Pruitt, G.D. Schuler, E. Sequeira, S.T. Sherry, K. Sirotkin, A. Souvorov, G. Starchenko, R.L. Tatusov, T.A. Tatusova, L. Wagner, and E. Yaschenko, "Database Resources of the National Center for Biotechnology Information," Nucleic Acids Research, vol. 35, database issue, pp. D5-D12, 2007.
[4] V.M. Markowitz and O. Ritter, "Characterizing Heterogeneous Molecular Biology Database Systems," J. Computational Biology, vol. 2, no. 4, pp. 547-556, 1995.
[5] S.Y. Chung and J.C. Wooley, "Challenges Faced in the Integration of Biological Information," Bioinformatics: Managing Scientific Data, Z. Lacroix and T. Critchlow, eds., chapter 2, pp. 21-24, Morgan Kaufmann, 2003.
[6] S.B. Davidson, J. Crabtree, B.P. Brunk, J. Schug, V. Tannen, G.C. Overton, and C.J. Stoeckert Jr., "K2/Kleisli and GUS: Experiments in Integrated Access to Genomic Data Sources," IBM Systems J., vol. 40, no. 2, pp. 512-531, http://www.gusdb.org/about.php, 2001.
[7] T.J. Lee, Y. Pouliot, V. Wagner, P. Gupta, D.W.J. Stringer-Calvert, J.D. Tenenbaum, and P.D. Karp, "BioWarehouse: A Bioinformatics Database Warehouse Toolkit," BMC Bioinformatics, vol. 7, p. 170, http://biowarehouse.ai.sri.com/, 2006.
[8] E. Zdobnov, R. Lopez, R. Apweiler, and T. Etzold, "The EBI SRS Server—Recent Developments," Bioinformatics, vol. 18, no. 2, pp. 368-373, 2002.
[9] V. Tannen, S. Davidson, and S. Harker, "The Information Integration System K2," Bioinformatics: Managing Scientific Data, Z. Lacroix and T. Critchlow, eds., chapter 8, pp. 225-248, Morgan Kaufmann, 2003.
[10] R. Stevens, P. Baker, S. Bechhofer, G. Ng, A. Jacoby, N. Paton, C. Goble, and A. Brass, "TAMBIS: Transparent Access to Multiple Bioinformatics Information Sources," Bioinformatics, vol. 16, no. 2, pp. 184-186, 2000.
[11] M.W. Bright, A.R. Hurson, and S.H. Pakzad, "A Taxonomy and Current Issues in Multidatabase Systems," Computer, vol. 25, no. 3, pp. 50-60, 1992.
[12] A.P. Sheth and J.A. Larson, "Federated Database Systems for Managing Distributed, Heterogeneous, and Autonomous Databases," ACM Computing Surveys, vol. 22, no. 3, pp. 183-236, 1990.
[13] G. Wiederhold and M. Genesereth, "The Conceptual Basis for Mediation Services," IEEE Expert, vol. 12, no. 5, pp. 38-47, 1997.
[14] http://www.webopedia.com/TERM/mmash_up.html , 2008.
[15] M. Galperin, "The Molecular Biology Database Collection: 2007 Update," Nucleic Acids Research, vol. 35, database issue, pp. D3-D4, 2007.
[16] H. Sun, S. Palaniswamy, T. Pohar, V. Jin, and R.V. Davuluri, "MPromDb: An Integrated Resource for Annotation and Visualization of Mammalian Gene Promoters and ChIP-Chip Experimental Data," Nucleic Acids Research, vol. 34, database issue, pp. D98-D103, 2006.
[17] S. Griffiths-Jones, R.J. Grocock, S. van Dongen, A. Bateman, and A.J. Enright, "miRBase: microRNA Sequences, Targets and Gene Nomenclature," Nucleic Acids Research, vol. 34, database issue, pp. D140-D144, 2006.
[18] A. Bateman, L. Coin, R. Durbin, R.D. Finn, V. Hollich, S. Griffiths-Jones, A. Khanna, M. Marshall, S. Moxon, E.L.L. Sonnhammer, D.J. Studholme, C. Yeats, and S.R. Eddy, "The Pfam Protein Families Database," Nucleic Acids Research, vol. 32, database issue, pp. D138-D141, 2004.
[19] A. Chatr-aryamontri, A. Ceol, L.M. Palazzi, G. Nardelli, M.V. Schneider, L. Castagnoli, and G. Cesareni, "MINT: The Molecular INTeraction Database," Nucleic Acids Research, vol. 35, database issue, pp. D572-D574, 2007.
[20] J. Demeter, C. Beauheim, J. Gollub, T. Hernandez-Boussard, H. Jin, D. Maier, J.C. Matese, M. Nitzberg, F. Wymore, Z.K. Zachariah, P.O. Brown, G. Sherlock, and C.A. Ball, "The Stanford Microarray Database: Implementation of New Analysis Tools and Open Source Release of Software," Nucleic Acids Research, vol. 35, database issue, pp. D766-D770, 2007.
[21] D. Benson, I. Karsch-Mizrachi, D. Lipman, J. Ostell, and D. Wheeler, "GenBank," Nucleic Acids Research, vol. 34, database issue, pp. D16-D20, 2007.
[22] T. Kulikova, R. Akhtar, P. Aldebert, N. Althorpe, M. Andersson, A. Baldwin, K. Bates, S. Bhattacharyya, L. Bower, P. Browne, M. Castro, G. Cochrane, K. Duggan, R. Eberhardt, N. Faruque, G. Hoad, C. Kanz, C. Lee, R. Leinonen, Q. Lin, V. Lombard, R. Lopez, D. Lorenc, H. McWilliam, G. Mukherjee, F. Nardone, M. Pilar Garcia Pastor, S. Plaister, S. Sobhany, P. Stoehr, R. Vaughan, D. Wu, W. Zhu, and R. Apweiler, "EMBL Nucleotide Sequence Database in 2006," Nucleic Acids Research, vol. 35, database issue, pp. D16-D20, 2007.
[23] J. Duvick, A. Fu, U. Muppirala, M. Sabharwal, M.D. Wilkerson, C.J. Lawrence, C. Lushbough, and V. Brendel, "PlantGDB: A Resource for Comparative Plant Genomics," Nucleic Acids Research, vol. 36, database issue, 2007, doi: 10.1093/nar/gkm1041.
[24] C. Lushbough and T. Tiahrt, "Field Stream Database System—Data Mining Storage for Biological Data," unpublished, http://www.usd.edu/csci/bioinformaticsFieldStreamPaper.pdf , 2005.
[25] S.F. Altschul, T.L. Madden, A.A. Schäffer, J. Zhang, Z. Zhang, W. Miller, and D.J. Lipman, "Gapped BLAST and PSI-BLAST: A New Generation of Protein Database Search Programs," Nucleic Acids Research, vol. 25, no. 17, pp. 3389-3402, 1997.
[26] M. Senger, P. Rice, and T. Oinn, "Soaplab—A Unified Sesame Door to Analysis Tools," Proc. UK e-Science All Hands Meeting '03, pp. 509-513, http://www.nesc.ac.uk/events/ahm2003/AHMCD/ pdf115.pdf, Sept. 2003.
[27] M.D. Wilkinson and M. Links, "BioMOBY: An Open-Source Biological Web Services Proposal," Briefings in Bioinformatics, vol. 3, no. 4, pp. 331-341, 2002.
[28] D. Hull, R. Stevens, P. Lord, and C. Goble, "Integrating Bioinformatics Resources Using Shims," Proc. 12th Int'l Conf. Intelligent Systems for Molecular Biology (ISMB '04), http://www.iscb.org/ismb2004/postersduncan.hullATcs.man.ac.uk_445.html , 2004.
[29] E. Deelman and Y. Gil, Workshop on the Challenges of Scientific Workflows, sponsored by the Nat'l Science Foundation, http://www.isi.edu/nsf-workflows06, May 2006.
[30] T. Barkman, T. Martins, E. Sutton, and J. Stout, "Positive Selection for Single Amino Acid Change Promotes Substrate Discrimination of a Plant Volatile-Producing Enzyme," Molecular Biology and Evolution, vol. 24, no. 6, pp. 1320-1329, 2007.
[31] R. Chenna, H. Sugawara, T. Koike, R. Lopez, T.J. Gibson, D.G. Higgins, and J.D. Thompson, "Multiple Sequence Alignment with the Clustal Series of Programs," Nucleic Acids Research, vol. 31, no. 13, pp. 3497-3500, 2003.
[32] A. Buccella and A. Cechich, "An Ontology Approach to Data Integration," J. Computer Science and Technology, vol. 3, no. 2, pp. 62-68, 2003.
[33] C. Pluempitiwiriyawej and J. Hammer, "A Classification Scheme for Semantic and Schematic Heterogeneities in XML Data Sources," Technical Report TR00-004, Univ. of Florida, http://www.cise.ufl.edu/tech_reports/tr00 tr00-004.pdf, Sept. 2000.
[34] J. Yu and R. Buyya, "A Taxonomy of Scientific Workflow Systems for Grid Computing," SIGMOD Record, vol. 34, no. 3, pp. 44-49, Sept. 2005.
[35] Sun Microsystems, Inc., "Java Message Service," http://java.sun.com/products/jmsdocs.html , 2002.
[36] Sun Microsystems, Inc., Sun ONE Application Framework Overview, http://docs.sun.com/source817-4360/, 2002.
[37] B. Ludäscher, I. Altintas, C. Berkley, D. Higgins, E. Jaeger, M. Jones, E.A. Lee, J. Tao, and Y. Zhao, "Scientific Workflow Management and the Kepler System," Concurrency and Computation: Practice and Experience, vol. 18, no. 10, pp. 1039-1065, 2006.
[38] L.M. Haas, P.M. Schwarz, E. Kodali, E. Kotlar, J. Rice, and W.C. Swope, "DiscoveryLink: A System for Integrated Access to Life Sciences Data Sources," IBM Systems J., vol. 40, no. 2, 2001.
[39] D. Hull, K. Wolstencroft, R. Stevens, C. Goble, M. Pocock, P. Li, and T. Oinn, "Taverna: A Tool for Building and Running Workflows of Services," Nucleic Acids Research, vol. 34, Web Server issue, pp. W729-W732, 2006.
[40] T. Hernandez and S. Kambhampati, "Integration of Biological Sources: Current Systems and Challenges Ahead," Proc. ACM SIGMOD '04, vol. 33, no. 3, pp. 51-60, 2004.
[41] N.W. Paton, R. Stevens, P. Baker, C.A. Goble, S.S. Bechhofer, and A. Brass, "Query Processing in the TAMBIS Bioinformatics Source Integration System," Proc. 11th Int'l Conf. Scientific and Statistical Database Management (SSDBM '99), pp. 138-147, 1999.
[42] R. Stevens, C. Goble, N. Paton, S. Bechhofer, G. Ng, P. Baker, and A. Brass, "Complex Query Formulation over Diverse Information Sources Using an Ontology," Proc. Workshop Computation of Biochemical Pathways and Genetic Networks, European Media Lab (EML '99), http://www.cs.man.ac.uk/~stevensr/papers eml99.pdf, pp. 83-88, 1999.
[43] A. Goderis, C. Brooks, I. Altintas, E.A. Lee, and C.A. Goble, "Composing Different Models of Computation in Kepler and Ptolemy II," Proc. Int'l Conf. Computational Science (ICCS '07), http://www.mygrid.org.uk/wiki/bin/viewfile/ MygridPapersStore?rev=1;filename=final_in_8_pages.pdf , May 2007.
[44] Kepler Project, "Getting Started with Kepler," http://kepler-project.org/Wiki.jsp?page=Documentation , 2008.
[45] S. Bowers, T. McPhillips, S. Riddle, M. Anand, and B. Ludaescher, "Kepler/pPOD: Scientific Workflow and Provenance Support for Assembling the Tree of Life," Proc. Int'l Provenance Annotation Workshop (IPAW), 2008.
[46] A. Barker and J. van Hemert, "Scientific Workflow: A Survey and Research Directions," Proc. Seventh Int'l Conf. Parallel Processing and Applied Math. (PPAM '08), revised selected papers, Roman Wyrzykowski et al., eds., pp. 746-753, 2008.
[47] D. Butler, "Mashups Mix Data into Global Service," Nature, vol. 439, pp. 6-7, 2006.
[48] P. Ferragina and R. Grossi, "The String B-Tree: A New Data Structure for String Search in External Memory and Its Applications," J. ACM, vol. 46, no. 2, pp. 236-280, 1999.
[49] S. Heinz, J. Zobel, and H. Williams, "Burst Tries: A Fast, Efficient Data Structure for String Keys," ACM Trans. Information Systems, vol. 20, no. 2, pp. 192-223, 2002.
[50] N. Askitis and R. Sinha, "HAT-Trie: A Cache-Conscious Trie-Based Data Structure for Strings," Proc. 30th Australasian Conf. Computer Science (ACSC '07), vol. 62, pp. 97-105, 2007.
[51] M. Cameron and H. Williams, "Comparing Compressed Sequences for Faster Nucleotide BLAST Searches," IEEE/ACM Trans. Computational Biology and Bioinformatics, vol. 4, no. 3, pp. 349-360, July-Sept. 2007.
[52] G. Navarro and V. Mäkinen, "Compressed Full-Text Indexes," ACM Computing Surveys, vol. 39, no. 1, pp. 1-61, 2007.

