Parallel and Distributed Processing Symposium, International (2008)
Miami, FL, USA
Apr. 14, 2008 to Apr. 18, 2008
Katharina Benkert, Parallel Software Technologies Laboratory, Department of Computer Science, University of Houston, TX 77204-3010, USA
Edgar Gabriel, Parallel Software Technologies Laboratory, Department of Computer Science, University of Houston, TX 77204-3010, USA
Michael M. Resch, High Performance Computing Center Stuttgart (HLRS), Nobelstr. 19, 70569, Germany
When an adaptive software component is employed to select the best-performing implementation of a communication operation at runtime, the correctness of its decision strongly depends on detecting and removing outliers in the data used for the comparison. This automatic decision is greatly complicated by the fact that the types and quantities of outliers depend on the network interconnect and on the nodes assigned to the job by the batch scheduler. This paper evaluates four different statistical methods for handling outliers, namely a standard interquartile range method, a heuristic derived from the trimmed mean value, cluster analysis, and a method using robust statistics. Using performance data from the Abstract Data and Communication Library (ADCL), we evaluate the correctness of the decisions made with each statistical approach over three fundamentally different network interconnects, namely a highly reliable InfiniBand network, a Gigabit Ethernet network with a larger variance in performance, and a hierarchical Gigabit Ethernet network.
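As a rough illustration of the first of these methods, the Python sketch below filters timing samples with the conventional interquartile range rule before comparing two implementations by their mean. It assumes the textbook fences at 1.5 times the IQR; the exact variant evaluated in the paper is not given in this abstract, and none of this is ADCL code. The function names (`iqr_filter`, `fastest`) and the timing values are hypothetical.

```python
import statistics


def iqr_filter(samples, k=1.5):
    """Split samples into (kept, outliers) using the interquartile range rule.

    Values outside [Q1 - k*IQR, Q3 + k*IQR] are treated as outliers.
    k = 1.5 is the conventional choice and an assumption of this sketch.
    """
    q1, _, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    kept = [x for x in samples if low <= x <= high]
    outliers = [x for x in samples if x < low or x > high]
    return kept, outliers


def fastest(timings, remove_outliers=True):
    """Return the name of the implementation with the smallest mean time."""
    means = {}
    for name, samples in timings.items():
        data = iqr_filter(samples)[0] if remove_outliers else samples
        means[name] = statistics.mean(data)
    return min(means, key=means.get)


# Hypothetical execution times of two implementations of the same operation.
# Implementation "A" is faster in general but has one disturbed measurement.
timings = {
    "A": [10.0, 10.2, 9.9, 10.1, 10.3, 9.8, 10.0, 25.0],
    "B": [10.5, 10.6, 10.4, 10.5, 10.7, 10.3, 10.5, 10.6],
}

print(fastest(timings, remove_outliers=False))
print(fastest(timings, remove_outliers=True))
```

With these made-up numbers, the single disturbed measurement is enough to change which implementation has the smaller raw mean, which is the kind of wrong decision that outlier handling is meant to prevent.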
K. Benkert, M. M. Resch and E. Gabriel, "Outlier detection in performance data of parallel applications," 2008 IEEE International Parallel & Distributed Processing Symposium (IPDPS), Miami, FL, 2008, pp. 1-8.