Issue No.01 - Jan.-March (2012 vol.5)
pp: 45-58
Xubo Fei, Wayne State University, Detroit
Shiyong Lu, Wayne State University, Detroit
Scientific workflow has recently become an enabling technology for automating and speeding up the scientific discovery process. Although several scientific workflow management systems (SWFMSs) have been developed, a formal scientific workflow composition model in which workflow constructs are fully compositional with one another is still missing. In this paper, we propose a dataflow-based scientific workflow composition framework consisting of 1) a dataflow-based scientific workflow model that separates the declaration of the workflow interface from the definition of its functional body; 2) a set of workflow constructs, including Map, Reduce, Tree, Loop, Conditional, and Curry, which are fully compositional with one another; and 3) a dataflow-based exception handling approach that supports hierarchical exception propagation and user-defined exception handling. Our workflow composition framework is unique in that workflows are the only operands for composition; in this way, our approach elegantly solves the two-world problem of existing composition frameworks, in which composition must deal with both the world of tasks and the world of workflows. We have implemented the proposed framework and conducted several case studies to validate our techniques.
Scientific workflow, scientific workflow model, workflow composition, MapReduce, VIEW.
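To make the idea of "workflows as the only operands" concrete, the following is a minimal illustrative sketch in Python. It is not the authors' implementation and does not reflect the VIEW system's API; the `Workflow` class, the port names, and the helpers `primitive`, `map_wf`, `reduce_wf`, `curry_wf`, and `pipe` are all hypothetical names chosen for this example. It only shows how an interface (declared ports) can be kept separate from a functional body, and how constructs such as Map, Reduce, and Curry can take workflows and return workflows so that they nest freely.

```python
from dataclasses import dataclass
from functools import reduce as fold
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class Workflow:
    """Hypothetical workflow: a declared interface plus a functional body."""
    inputs: Tuple[str, ...]              # declared input port names
    outputs: Tuple[str, ...]             # declared output port names
    body: Callable[..., Dict[str, Any]]  # maps input values to output values

    def run(self, **args: Any) -> Dict[str, Any]:
        missing = set(self.inputs) - set(args)
        if missing:
            raise ValueError(f"unbound input ports: {sorted(missing)}")
        return self.body(**args)


def primitive(fn, inputs, output="out"):
    """Wrap a plain function as a workflow, so constructs only ever see workflows."""
    return Workflow(tuple(inputs), (output,), lambda **a: {output: fn(**a)})


def map_wf(w):
    """Map: apply a one-input, one-output workflow to every element of a list."""
    (i,), (o,) = w.inputs, w.outputs
    return Workflow((i,), (o,),
                    lambda **a: {o: [w.run(**{i: x})[o] for x in a[i]]})


def reduce_wf(w, init):
    """Reduce: fold a list using a two-input (accumulator, item) workflow."""
    (acc, item), (o,) = w.inputs, w.outputs
    return Workflow(("items",), (o,),
                    lambda **a: {o: fold(
                        lambda s, x: w.run(**{acc: s, item: x})[o],
                        a["items"], init)})


def curry_wf(w, **fixed):
    """Curry: bind some input ports, yielding a workflow over the remaining ports."""
    rest = tuple(p for p in w.inputs if p not in fixed)
    return Workflow(rest, w.outputs, lambda **a: w.run(**fixed, **a))


def pipe(first, second):
    """Dataflow link: feed the single output of `first` into the single input of `second`."""
    (o,), (i,) = first.outputs, second.inputs
    return Workflow(first.inputs, second.outputs,
                    lambda **a: second.run(**{i: first.run(**a)[o]}))


# Every operand below is a workflow, and every result is a workflow again.
scale = primitive(lambda x, k: x * k, ("x", "k"))
double = curry_wf(scale, k=2)
add = primitive(lambda acc, x: acc + x, ("acc", "x"))
sum_of_doubles = pipe(map_wf(double), reduce_wf(add, 0))

print(sum_of_doubles.run(x=[1, 2, 3]))  # {'out': 12}
```

Because each construct both consumes and produces a `Workflow`, the result of `map_wf` can be passed directly to `pipe`, and the curried `double` can be passed to `map_wf`, without a separate notion of "task". The exception handling and the Tree, Loop, and Conditional constructs mentioned in the abstract are not modeled in this sketch.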
Xubo Fei, Shiyong Lu, "A Dataflow-Based Scientific Workflow Composition Framework", IEEE Transactions on Services Computing, vol.5, no. 1, pp. 45-58, Jan.-March 2012, doi:10.1109/TSC.2010.58