2017 IEEE International Conference on Cluster Computing (CLUSTER) (2017)
Honolulu, Hawaii, United States
Sept. 5, 2017 to Sept. 8, 2017
Job runtime estimates provided by users are widely acknowledged to be overestimated, and runtime overestimation can greatly degrade job scheduling performance. Previous studies focus on improving the accuracy of job runtime estimates by reducing overestimation, but fail to address the underestimation problem (i.e., estimates that fall short of the actual job runtime). Using an underestimated runtime is catastrophic to a job, as the job will be killed by the scheduler before completion. We argue that improving runtime accuracy and reducing the underestimation rate are equally important. To address this problem, we propose an online runtime adjustment framework called TRIP. TRIP exploits the data censoring capability of the Tobit model to improve prediction accuracy while keeping the underestimation rate of job runtimes low. TRIP can be used as a plugin to a job scheduler to improve job runtime estimates and hence boost job scheduling performance. Preliminary results demonstrate that TRIP is capable of achieving a high accuracy of 80% and a low underestimation rate of 5%. This is a significant improvement over other well-known machine learning methods such as SVM, Random Forest, and Last-2, which result in a high underestimation rate (20%-50%). Our experiments further quantify the scheduling performance gain achieved by the use of TRIP.
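The abstract does not spell out how the Tobit model's censoring is applied, so the following is only a minimal sketch of the general idea, not TRIP itself. It assumes that a job killed at its runtime limit is a right-censored observation (its true runtime is at least the limit), uses an intercept-only Tobit model with a fixed, assumed noise scale, and fits the mean by a simple grid search. All names (`tobit_neg_loglik`, `SIGMA`, the sample runtimes) are hypothetical and chosen for illustration.

```python
import math


def normal_logpdf(z):
    """Log density of the standard normal distribution at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def normal_logsf(z):
    """Log of the standard normal survival function, log(1 - Phi(z))."""
    return math.log(0.5 * math.erfc(z / math.sqrt(2.0)))


def tobit_neg_loglik(mu, sigma, runtimes, censored):
    """Negative log-likelihood of an intercept-only, right-censored Tobit model.

    Uncensored jobs contribute the normal density of their observed runtime.
    Censored jobs (assumed here to be jobs killed at their limit) contribute
    only the probability that the true runtime exceeds the recorded value.
    """
    total = 0.0
    for y, is_censored in zip(runtimes, censored):
        z = (y - mu) / sigma
        if is_censored:
            total += normal_logsf(z)
        else:
            total += normal_logpdf(z) - math.log(sigma)
    return -total


# Hypothetical history for one user, in minutes. The last two jobs were
# killed at a 100-minute limit, so their true runtimes are unknown but >= 100.
runtimes = [50.0, 60.0, 70.0, 100.0, 100.0]
censored = [False, False, False, True, True]

SIGMA = 25.0  # assumed noise scale, fixed to keep the sketch one-dimensional

# Naive estimate: treat the killed jobs as if they finished at the limit.
naive_mean = sum(runtimes) / len(runtimes)

# Tobit estimate: grid search for the mean that minimizes the censored NLL.
candidates = [float(m) for m in range(50, 121)]
mu_hat = min(
    candidates,
    key=lambda m: tobit_neg_loglik(m, SIGMA, runtimes, censored),
)

print("naive mean:", naive_mean)
print("Tobit mean:", mu_hat)
```

Because the censored jobs only bound the runtime from below, the Tobit fit pushes the estimated mean above the naive average. This illustrates why a censoring-aware model can lower the risk of underestimation. A realistic predictor would also estimate the noise scale and use job features as regressors.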
Runtime, Support vector machines, Scheduling, Estimation, Data models, Processor scheduling, Mathematical model
Y. Fan, P. Rich, W. E. Allcock, M. E. Papka and Z. Lan, "Trade-Off Between Prediction Accuracy and Underestimation Rate in Job Runtime Estimates," 2017 IEEE International Conference on Cluster Computing (CLUSTER), Honolulu, Hawaii, United States, 2017, pp. 530-540.