2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) (2016)
Chicago, IL, USA
May 23, 2016 to May 27, 2016
Scientific workflows have become a mainstream means of conducting large-scale scientific research. Meanwhile, cloud computing has emerged as an alternative computing paradigm. In this paper, we analyze the performance of a real I/O-intensive scientific workflow in cloud environments, using makespan (the turnaround time for a workflow to complete its execution) as the key performance metric. In particular, we assess the impact of varying the storage configuration on workflow performance when executing on Google Cloud and Amazon Web Services. We aim to understand the performance bottlenecks of these popular cloud-based execution environments. Experimental results show significant differences in application performance across configurations. They also reveal that Amazon Web Services outperforms Google Cloud with equivalent application and system configurations. We then investigate the root cause of these results using provenance data and by benchmarking disk and network I/O on both infrastructures. Lastly, we suggest modifications to the standard cloud storage APIs that will reduce the makespan of I/O-intensive workflows.
Cloud computing, Google, Data transfer, Standards, Benchmark testing, Performance analysis
Hassan Nawaz, Gideon Juve, Rafael Ferreira Da Silva, Ewa Deelman, "Performance Analysis of an I/O-Intensive Workflow Executing on Google Cloud and Amazon Web Services", 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 535-544, 2016, doi:10.1109/IPDPSW.2016.90