Issue No.02 - February (2003 vol.14)
<p><b>Abstract</b>: In most parallel supercomputers, submitting a job for execution involves specifying how many processors are to be allocated to the job. When the job is moldable (i.e., there is a choice of how many processors the job uses), an application scheduler called <b>SA</b> can significantly improve job performance by automatically selecting how many processors to use. Since most jobs are moldable, this result has a great impact on the current state of practice in supercomputer scheduling. However, the widespread use of <b>SA</b> can change the nature of the workload processed by supercomputers. When many <b>SA</b>s are scheduling jobs on one supercomputer, the decision made by one <b>SA</b> affects the state of the system and therefore affects the other instances of <b>SA</b>. In this case, the global behavior of the system emerges from the <i>aggregate behavior</i> of all <b>SA</b>s. In particular, it is reasonable to expect the competition for resources to become tougher with multiple <b>SA</b>s, and this tougher competition to reduce the performance improvement attained by each <b>SA</b> individually. This paper investigates this very issue. We found that the increased competition indeed makes it harder for each individual instance of <b>SA</b> to improve job performance. Nevertheless, two other aggregate behaviors override the increased competition when the system load is moderate to heavy. First, as load goes up, <b>SA</b> chooses smaller requests; smaller requests run more efficiently, which effectively decreases the offered load and mitigates long wait times. Second, better job packing and fewer jobs in the system make it easier for incoming jobs to fit into the supercomputer schedule, thus reducing wait times further. As a result, under moderate to heavy load, a single instance of <b>SA</b> benefits from the fact that other jobs are also using <b>SA</b>.</p>
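The first effect named in the abstract (higher load leads to smaller requests, which run more efficiently) can be illustrated with a toy model. The sketch below is not the paper's implementation of SA. It assumes that a request is chosen by minimizing estimated turnaround time (wait time plus execution time), that execution time follows Amdahl's law, and that wait time grows with both system load and the fraction of the machine requested. All function names, parameters, and numeric values are hypothetical.

```python
def runtime(n, seq_time, serial_fraction):
    """Execution time on n processors under Amdahl's law (assumed speedup model)."""
    return seq_time * (serial_fraction + (1.0 - serial_fraction) / n)


def estimated_wait(n, load, total_procs, base_wait):
    """Hypothetical wait-time model: wait grows with the load (0 <= load < 1)
    and with the fraction of the machine that the request asks for."""
    return base_wait * load / (1.0 - load) * (n / total_procs)


def select_request(candidates, seq_time, serial_fraction, load,
                   total_procs=128, base_wait=3600.0):
    """Return the candidate processor count with the lowest estimated turnaround."""
    def turnaround(n):
        wait = estimated_wait(n, load, total_procs, base_wait)
        return wait + runtime(n, seq_time, serial_fraction)
    return min(candidates, key=turnaround)


def efficiency(n, serial_fraction):
    """Parallel efficiency (speedup / n) under the same Amdahl model."""
    return 1.0 / (serial_fraction * n + (1.0 - serial_fraction))


if __name__ == "__main__":
    candidates = [1, 2, 4, 8, 16, 32, 64, 128]  # moldable job: any of these sizes works
    for load in (0.3, 0.6, 0.9):
        n = select_request(candidates, seq_time=36000.0,
                           serial_fraction=0.05, load=load)
        print(f"load={load:.1f}  chosen n={n:3d}  "
              f"efficiency={efficiency(n, 0.05):.2f}")
```

Under these assumed parameters the chosen request shrinks as the load rises, and the smaller request has higher parallel efficiency, so the job consumes fewer processor-seconds in total. That reduction in consumed resources is the sense in which the abstract says smaller requests decrease the offered load.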
Index Terms: Parallel supercomputers, space-shared supercomputers, job scheduling, application scheduling, aggregate behavior.
Walfredo Cirne, "When the Herd Is Smart: Aggregate Behavior in the Selection of Job Request", IEEE Transactions on Parallel & Distributed Systems, vol.14, no. 2, pp. 181-192, February 2003, doi:10.1109/TPDS.2003.1178881