The Community for Technology Leaders
2016 International Conference on Parallel Architecture and Compilation Techniques (PACT) (2016)
Haifa, Israel
Sept. 11, 2016 to Sept. 15, 2016
ISBN: 978-1-5090-5308-7
pp: 287-297
Yong Zhao , The University of Texas at Arlington, United States
Jia Rao , The University of Texas at Arlington, United States
Qing Yi , The University of Colorado Colorado Springs, United States
As virtualization becomes ubiquitous in datacenters, there is a growing interest in characterizing application performance in multi-tenant environments to improve datacenter resource management. The performance of parallel programs is notoriously difficult to reason about in virtualized environments. Although performance degradations caused by virtualization and interferences have been extensively studied, there still lacks a comprehensive understanding why parallel programs have unpredictable slowdowns when co-located with different types of workloads. This paper presents a systematic and quantitative study of multithreaded performance under interference. We design synthetic workloads to emulate different types of interference and study the behavior of parallel programs under such interferences. We find that unpredictable performance is the result of complex interplays between the design of the program, the memory hierarchy of the host system, and the CPU scheduling at the hypervisor. To understand the intricate relationships between multiple factors, we decompose parallel runtime into compute, synchronization and steal time, and use the runtime breakdown to measure program progress and identify execution inefficiency under interference. Based on these findings, we develop an online approach to predicting performance slowdown without requiring parallel programs to be completed, and devise two scheduling optimizations at the hypervisor to reduce slowdowns. Experimental results with Xen and representative parallel workloads show that the online performance prediction achieves on average less than 4.5% error and the optimizations reduce runtime slowdown by as much as 38% compared to stock Xen.
Interference, Synchronization, Instruction sets, Runtime, Virtual machine monitors, Hardware, Spinning

Y. Zhao, J. Rao and Q. Yi, "Characterizing and optimizing the performance of multithreaded programs under interference," 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT), Haifa, Israel, 2016, pp. 287-297.
80 ms
(Ver 3.3 (11022016))