LOMARC: Lookahead Matchmaking for Multiresource Coscheduling on Hyperthreaded CPUs
November 2006 (vol. 17, no. 11)
pp. 1360-1375

Abstract—Job scheduling typically focuses on the CPU, with little existing work including I/O or memory. Time-shared execution offers the chance to hide I/O and long communication latencies, though it potentially creates memory conflicts. Hyperthreaded CPUs support coscheduling without any context switches and provide additional options for CPU-internal resource sharing. We present an approach that includes all possible resources in the schedule optimization and improves utilization by coscheduling two jobs if feasible. Our LOMARC approach partially reorders the queue by lookahead to increase the chance of finding good matches. In simulations based on the workload model of Lublin and Feitelson [10], we have obtained improvements between 30 percent and 50 percent in both response times and relative bounded response times on hyperthreaded CPUs (i.e., times cut to two-thirds or to half).
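To make the matchmaking idea concrete, the following minimal Python sketch shows one way lookahead matchmaking could work: scan a bounded window of the waiting queue for a partner whose resource profile complements the head job, and veto any pair whose combined memory demand would conflict. The Job fields, the match_score heuristic, the lookahead depth, and the threshold are all illustrative assumptions, not the paper's actual criteria; the paper defines its own resource model and matching rules.

# Hypothetical sketch of lookahead matchmaking; all names, the scoring
# formula, and the parameters below are assumptions for illustration.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Job:
    name: str
    cpu: float   # fraction of time spent computing (0..1)
    mem: float   # fraction of per-node memory demanded (0..1)
    io: float    # fraction of time blocked on I/O
    comm: float  # fraction of time blocked on communication

def match_score(a: Job, b: Job) -> float:
    """Hypothetical score: reward pairs where one job's I/O or
    communication stalls can be overlapped with the other job's
    computation; veto pairs whose memory demands would conflict."""
    if a.mem + b.mem > 1.0:          # memory conflict: never coschedule
        return float("-inf")
    overlap = min(a.io + a.comm, b.cpu) + min(b.io + b.comm, a.cpu)
    return overlap

def pick_partner(head: Job, queue: list[Job],
                 lookahead: int = 5,
                 threshold: float = 0.2) -> Optional[Job]:
    """Scan up to `lookahead` queued jobs and return the best-matching
    partner for `head`, or None if no candidate beats `threshold`
    (in which case `head` would run alone)."""
    best, best_score = None, threshold
    for cand in queue[:lookahead]:
        score = match_score(head, cand)
        if score > best_score:
            best, best_score = cand, score
    return best

# Example: an I/O-bound job pairs with a compute-bound one, while a
# memory-heavy candidate is rejected outright.
head = Job("io_bound", cpu=0.4, mem=0.3, io=0.5, comm=0.1)
queue = [Job("mem_hog", cpu=0.9, mem=0.8, io=0.0, comm=0.1),
         Job("compute", cpu=0.9, mem=0.4, io=0.0, comm=0.1)]
print(pick_partner(head, queue))  # -> the "compute" job

The sketch mirrors the abstract's reasoning: memory is treated as a hard constraint (a conflict rules coscheduling out entirely), while I/O and communication stalls are treated as latency-hiding opportunities that make a pair attractive.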

[1] D. Feitelson, “Job Scheduling in Multiprogrammed Parallel Systems, Extended Version,” IBM, technical report, RC 19790 (87657), Aug. 1997.
[2] J. Moreira, W. Chan, L. Fong, H. Franke, and M. Jette, “An Infrastructure for Efficient Parallel Job Execution in Terascale Computing Environments,” Proc. ACM/IEEE Supercomputing Conf. (SC), Nov. 1998.
[3] A. Dusseau, R. Arpaci, and D. Culler, “Implicit Scheduling—Efficient Distributed Scheduling for Parallel Workloads on Networks of Workstations,” Proc. SIGMETRICS Conf. Measurement and Modelling of Computer Systems, 1996.
[4] P. Sobalvarro, S. Pakin, W. Weihl, and A. Chien, “Dynamic Coscheduling on Workstation Clusters,” Proc. Workshop Job Scheduling Strategies for Parallel Processing (JSSPP), 1998.
[5] S. Nagar, A. Banerjee, A. Sivasubramaniam, and C. Das, “A Closer Look at Coscheduling Approaches for a Network of Workstations,” Proc. ACM Symp. Parallel Algorithms and Architectures (SPAA), 1999.
[6] Y. Zhang, A. Sivasubramaniam, J. Moreira, and H. Franke, “A Simulation-Based Study of Scheduling Mechanisms for a Dynamic Cluster Environment,” Proc. Int'l Conf. Supercomputing (ICS), 2000.
[7] A. Sodan, “Loosely Coordinated Coscheduling in the Context of Other Dynamic Job Scheduling Approaches—A Survey,” Concurrency & Computation: Practice & Experience, vol. 17, no. 15, pp. 1725-1781, Dec. 2005.
[8] F. da Silva and I. Scherson, “Concurrent Gang: Towards a Flexible and Scalable Gang Scheduler,” Proc. 11th Symp. Computer Architecture and High Performance Computing, 1999.
[9] Y. Zhou and A. Sodan, “Survey of Zero-Copy Optimization in User-Level Communication and Adaptive Knowledge-Based Solution,” Proc. Conf. High Performance Computing Systems (HPCS), 2004.
[10] U. Lublin and D. Feitelson, “The Workload on Parallel Supercomputers—Modelling the Characteristics of Rigid Jobs,” J. Parallel and Distributed Computing, vol. 63, no. 11, pp. 1105-1122, Nov. 2003.
[11] A. Sodan and L. Lan, “LOMARC—Lookahead Matchmaking in Multi-Resource Coscheduling,” Proc. Workshop Job Scheduling Strategies for Parallel Processing (JSSPP), pp. 288-315, 2004.
[12] J. Ousterhout, “Scheduling Techniques for Concurrent Systems,” Proc. Third Int'l Conf. Distributed Computing Systems, 1982.
[13] A. Sodan and X. Huang, “Adaptive Time/Space Sharing with SCOJO,” Proc. Conf. High Performance Computing Systems (HPCS), 2004.
[14] A. Sodan and M. Riyadh, “Coscheduling of MPI and Adaptive Thread Applications in a Solaris Environment,” Proc. IASTED Int'l Conf. Parallel and Distributed Computing and Systems (PDCS), 2002.
[15] E. Frachtenberg, D. Feitelson, F. Petrini, and J. Fernandez, “Flexible Coscheduling—Mitigating Load Imbalance and Improving Utilization of Heterogeneous Resources,” Proc. Int'l Parallel and Distributed Processing Symp. (IPDPS), 2003.
[16] Y. Wiseman and D. Feitelson, “Paired Gang Scheduling,” IEEE Trans. Parallel and Distributed Systems, vol. 14, no. 6, pp. 581-592, June 2003.
[17] E. Shmueli and D. Feitelson, “Backfilling with Lookahead to Optimize the Performance of Parallel Job Scheduling,” Proc. Workshop Job Scheduling Strategies for Parallel Processing (JSSPP), 2003.
[18] D. Talby and D. Feitelson, “Supporting Priorities and Improving Utilization of the IBM SP2 Scheduler Using Slack-Based Backfilling,” Proc. Int'l Symp. Parallel Processing (IPPS), 1999.
[19] S. Setia, M. Squillante, and V.K. Naik, “The Impact of Job Memory Requirements on Gang-Scheduling Performance,” Performance Evaluation Rev., Mar. 1999.
[20] A. Batat and D. Feitelson, “Gang Scheduling with Memory Considerations,” Proc. Int'l Symp. Parallel and Distributed Processing (IPDPS), 2000.
[21] W. Leinberger, G. Karypis, and V. Kumar, “Job Scheduling in the Presence of Multiple Resource Requirements,” Proc. IEEE/ACM Supercomputing Conf. (SC), 1999.
[22] D. Tullsen, S. Eggers, and H. Levy, “Simultaneous Multithreading—Maximizing On-Chip Parallelism,” Proc. Ann. Int'l Symp. Computer Architecture (ISCA), 1995.
[23] D. Marr, F. Binns, D. Hill, G. Hinton, D. Koufaty, J. Miller, and M. Upton, “Hyper-Threading Technology Architecture and Microarchitecture,” Intel Technology J. Q1, vol. 6, no. 1, 2002.
[24] T. Leng, R. Ali, J. Hsieh, V. Mashayekhi, and R. Rooholamini, “An Empirical Study of Hyper-Threading in High Performance Computing Clusters,” Linux HPC Revolution, 2002.
[25] W. Magro, P. Peterson, and S. Shah, “Hyper-Threading Technology: Impact on Compute-Intensive Workloads,” Intel Technology J. Q1, vol. 6, no. 1, 2002.
[26] W. Lee, M. Frank, V. Lee, K. Mackenzie, and L. Rudolph, “Implications of I/O for Gang Scheduled Workloads,” Proc. Conf. Job Scheduling Strategies for Parallel Processing, 1997.
[27] S. Figueira and F. Berman, “A Slowdown Model for Applications Executing on Time-Shared Clusters of Workstations,” IEEE Trans. Parallel and Distributed Systems, vol. 12, no. 6, June 2001.
[28] R. Gibbons, “Historical Application Profiler for Use by Parallel Schedulers,” Proc. Workshop Job Scheduling Strategies for Parallel Processing (JSSPP), 1997.
[29] Y. Zhang, A. Sivasubramaniam, J. Moreira, and H. Franke, “Impact of Workload and System Parameters on Next Generation Cluster Scheduling Mechanism,” IEEE Trans. Parallel and Distributed Systems, vol. 12, no. 9, Sept. 2001.
[30] D. Bailey, T. Harris, W. Saphir, R. van der Wijngaart, A. Woo, and M. Yarrow, “The NAS Parallel Benchmarks 2.0,” NAS Technical Report NAS-95-020, NASA Ames Research Center, Moffett Field, Calif., 1995.
[31] S.-H. Chiang and M. Vernon, “Characteristics of a Large Shared Memory Production Workload,” Proc. Workshop Job Scheduling Strategies for Parallel Processing (JSSPP), 2001.
[32] A. Mu'alem and D. Feitelson, “Utilization, Predictability, Workloads and User Runtime Estimates in Scheduling the IBM SP2 with Backfilling,” IEEE Trans. Parallel and Distributed Systems, vol. 12, no. 6, June 2001.
[33] “Lublin/Feitelson Workload Model, C Code,” http://www.cs.huji.ac.il/labs/parallel/workloadmodels.html, Dec. 2005.
[34] A.C. Sodan and L. Liu, “Dynamic Multi-Resource Monitoring for Predictive Job Scheduling with ScoPro,” Proc. IASTED Int'l Conf. Parallel and Distributed Computing and Systems (PDCS), Nov. 2005.
[35] D. Nikolopoulos and C. Polychronopoulos, “Adaptive Scheduling Under Memory Pressure on Multiprogrammed SMPs,” Proc. Int'l Parallel and Distributed Processing Symp. (IPDPS), Apr. 2002.

Index Terms:
Distributed architecture, multiprocessing, scheduling, threads, performance measures.
Citation:
Angela C. Sodan, Lei Lan, "LOMARC: Lookahead Matchmaking for Multiresource Coscheduling on Hyperthreaded CPUs," IEEE Transactions on Parallel and Distributed Systems, vol. 17, no. 11, pp. 1360-1375, Nov. 2006, doi:10.1109/TPDS.2006.160