Bioinformatic sequence data is typically analyzed via a pipeline of tools, which may be realized manually or through a script or workflow system. The explosive increase in the number of genomes available has made single-sequence analyses almost obsolete. Bioinformaticians now wish to compare and analyze multiple versions of similar sequences, and the greater statistical significance afforded by automated comparisons is vital to scientific investigation. This work describes recent extensions to the GPFlow scientific workflow system [1], in development at MQUTeR (www.mquter.qut.edu.au), which facilitate interactive experimentation, automatic lifting of computations from single-case to collection-oriented computation, and automatic correlation and synthesis of collections. A GPFlow workflow is presented as an acyclic data flow graph, yet provides powerful iteration and collection formation capabilities.
Index Terms:
eResearch, workflow
Citation:
Lawrence Buckingham, James M. Hogan, Paul Roe, Jiro Sumitomo, Michael Towsey, "Comparative Studies Made Simple in GPFlow," 2008 Eighth IEEE International Symposium on Cluster Computing and the Grid (CCGRID), p. 699, 2008