High Performance Computing and Grid in Asia Pacific Region, International Conference on (2005)
Nov. 30, 2005 to Dec. 3, 2005
Yang Wang, University of Alberta, Canada
Paul Lu, University of Alberta, Canada
<p>Traditional high-performance computing (HPC) systems have independent job schedulers and file systems that do not interact in substantial ways. We make the case that some integration of scheduler and file system can have three main benefits. First, the dataflow dependencies between the jobs in a workflow can be inferred by combining the scheduler's knowledge of the jobs (and possibly control-flow) and the file system's knowledge of the files accessed. Second, the dataflow information can be used to improve workflow instance concurrency when there are (potential) filename conflicts. Third, when workflows need to be re-computed, only the affected jobs need to be re-executed.</p> <p>We present the design and a simulation study of the Workflow-Aware File System (WaFS). Our design layers a Namespace Manager (NM) on top of existing file systems to provide, for example, a dataflow engine and a versioned file system. Our simulation study (with a specific set of application parameters) shows that a combined WaFS-aware file system and scheduler can significantly improve makespans for intensive workloads and be efficient in the re-computation of jobs.</p>
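The first and third benefits named in the abstract can be illustrated with a small sketch. It is not the WaFS design or its Namespace Manager; it only assumes that a scheduler can supply jobs in submission order and that a file system can report the files each job reads and writes. The function names `infer_dependencies` and `affected_jobs`, the job names, and the file names are all hypothetical.

```python
from collections import defaultdict, deque


def infer_dependencies(jobs):
    """Infer dataflow edges from observed file accesses.

    jobs: list of (name, reads, writes) tuples in submission order
    (an assumed input format, not one defined by the paper).
    Returns a dict mapping each job to the set of jobs it depends on.
    """
    last_writer = {}  # file name -> most recent job that wrote it
    deps = {name: set() for name, _, _ in jobs}
    for name, reads, writes in jobs:
        # A job depends on whichever earlier job last wrote a file it reads.
        for f in reads:
            if f in last_writer:
                deps[name].add(last_writer[f])
        for f in writes:
            last_writer[f] = name
    return deps


def affected_jobs(deps, changed):
    """Return the changed job plus every job downstream of it."""
    children = defaultdict(set)
    for job, parents in deps.items():
        for parent in parents:
            children[parent].add(job)
    seen = {changed}
    queue = deque([changed])
    while queue:  # breadth-first walk over the inferred dataflow graph
        current = queue.popleft()
        for child in children[current]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical four-job workflow.
jobs = [
    ("prep", {"raw.dat"}, {"clean.dat"}),
    ("stats", {"clean.dat"}, {"stats.txt"}),
    ("plot", {"stats.txt"}, {"fig.png"}),
    ("archive", {"raw.dat"}, {"raw.tar"}),
]
deps = infer_dependencies(jobs)
rerun = affected_jobs(deps, "prep")
```

In this example, re-running `prep` marks `stats` and `plot` for re-execution, while `archive` is left alone because it reads only the original input file.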
P. Lu and Y. Wang, "On the Benefits of a Workflow-Aware File System in High-Performance Computing Systems," High Performance Computing and Grid in Asia Pacific Region, International Conference on (HPCASIA), Beijing, China, 2005, pp. 227-234.