Parallel and Distributed Processing Symposium, International (2012)
Shanghai, China
May 21, 2012 to May 25, 2012
Simultaneous multithreading (SMT) increases CPU utilization and application performance in many circumstances, but it can be detrimental when performance is limited by application scalability or when there is significant contention for CPU resources. This paper describes an SMT-selection metric that predicts the change in application performance when the SMT level and number of application threads are varied. This metric is obtained online through hardware performance counters with little overhead, and allows the application or operating system to dynamically choose the best SMT level. We have validated the SMT-selection metric using a variety of benchmarks that capture various application characteristics on two different processor architectures. Our results show that the SMT-selection metric is capable of predicting the best SMT level for a given workload in 90% of the cases. The paper also shows that such a metric can be used with a scheduler or application optimizer to help guide its optimization decisions.
Benchmark testing, Measurement, Hardware, Instruction sets, Context, Scalability, Pipelines, Operating Systems, Performance Optimization, SMT
P. Pattnaik, J. Jann, K. El Maghraoui, J. R. Funston and A. Fedorova, "An SMT-Selection Metric to Improve Multithreaded Applications' Performance," Parallel and Distributed Processing Symposium, International (IPDPS), Shanghai, China, 2012, pp. 1388-1399.