Cluster Computing and the Grid, IEEE International Symposium on (2007)
Rio De Janeiro, Brazil
May 14, 2007 to May 17, 2007
J. L. Vazquez-Poletti, Universidad Complutense de Madrid, Spain
E. Huedo, Universidad Complutense de Madrid, Spain
R. S. Montero, Universidad Complutense de Madrid, Spain
I. M. Llorente, Universidad Complutense de Madrid, Spain
Bioinformatics demands more computational resources every day. The problems posed in this area are growing so complex that traditional computing systems can no longer handle them. Solving complex problems that can be divided into tasks with dependencies requires a workflow management system. In this paper, we introduce the use of the workflow management capabilities of the GridWay metascheduler to run a Bioinformatics application that implements a complex protein clustering algorithm in order to obtain non-redundant protein databases. Using a general-purpose metascheduling system provides the application with the fault tolerance and advanced scheduling capabilities needed to execute in a highly dynamic, heterogeneous and faulty environment. The execution results on a production Grid (the EGEE infrastructure) show the dramatic impact of remote queue waiting times on application performance and the critical need for efficient re-scheduling capabilities.
I. M. Llorente, J. L. Vazquez-Poletti, E. Huedo and R. S. Montero, "Workflow Management in a Protein Clustering Application," Cluster Computing and the Grid, IEEE International Symposium on (CCGRID), Rio De Janeiro, Brazil, 2007, pp. 679-684.