Performance-Driven Processor Allocation
July 2005 (vol. 16 no. 7)
pp. 599-611

Abstract—In current multiprogrammed multiprocessor systems, taking into account the performance of parallel applications is critical for deciding an efficient processor allocation. In this paper, we present the Performance-Driven Processor Allocation policy (PDPA). PDPA is a new scheduling policy that implements a processor allocation policy and a multiprogramming-level policy in a coordinated way, based on the measured application performance. With regard to processor allocation, PDPA is a dynamic policy that allocates to each application the maximum number of processors with which it reaches a given target efficiency. With regard to the multiprogramming level, PDPA admits a new application for execution when free processors are available and the allocations of all running applications are stable, or when some applications show poor performance. Results demonstrate that PDPA automatically adjusts the processor allocation of parallel applications to reach the specified target efficiency, and that it adapts the multiprogramming level to the workload characteristics. PDPA adjusts both the processor allocation and the multiprogramming level without human intervention, a desirable property for self-configurable systems, resulting in better individual application response times.
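The core mechanism the abstract describes can be illustrated with a small sketch: each scheduling step measures an application's efficiency (speedup divided by allocated processors) and searches for the largest allocation that still meets the target efficiency. This is only an illustrative sketch based on the abstract; the function name, the unit-step search, and all parameters are assumptions, not the paper's actual algorithm.

```python
# Hypothetical sketch of one efficiency-driven allocation step, inspired by
# the PDPA idea summarized in the abstract. The step-by-one search policy
# and all names here are illustrative assumptions.

def allocation_step(current_procs, speedup, target_eff, max_procs):
    """Move an application's allocation toward the largest processor
    count whose measured efficiency stays at or above target_eff."""
    efficiency = speedup / current_procs
    if efficiency >= target_eff and current_procs < max_procs:
        return current_procs + 1   # still efficient: try a larger allocation
    if efficiency < target_eff and current_procs > 1:
        return current_procs - 1   # below target: shrink the allocation
    return current_procs           # allocation is stable
```

Under this sketch, an application with ideal speedup keeps growing until it hits the processor limit, while one whose speedup flattens out is shrunk until its efficiency recovers; once no application changes its allocation, the multiprogramming-level policy could admit a new job.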

[1] T.B. Brecht and K. Guha, “Using Parallel Program Characteristics in Dynamic Processor Allocation,” Performance Evaluation, nos. 27-28, pp. 519-539, 1996.
[2] J. Corbalan, X. Martorell, and J. Labarta, “Dynamic Performance Analysis: SelfAnalyzer,” Technical Report UPC-DAC-2002-54, 2004.
[3] S.-H. Chiang, R.K. Mansharamani, and M.K. Vernon, “Use of Application Characteristics and Limited Preemption for Run-to-Completion Parallel Processor Scheduling Policies,” Proc. ACM SIGMETRICS Conf., pp. 33-44, May 1994.
[4] D.L. Eager, J. Zahorjan, and E.D. Lazowska, “Speedup versus Efficiency in Parallel Systems,” IEEE Trans. Computers, vol. 38, no. 3, pp. 408-423, Mar. 1989.
[5] F. Freitag, J. Corbalan, and J. Labarta, “A Dynamic Periodicity Detector: Application to Speedup Computation,” Proc. 15th Int'l Parallel and Distributed Processing Symp. (IPDPS 2001), pp. 2-8, Apr. 2001.
[6] H. Jin, M. Frumkin, and J. Yan, “The OpenMP Implementation of NAS Parallel Benchmarks and Its Performance,” Technical Report: NAS-99-011, 1999.
[7] B. Hamidzadeh and D.J. Lilja, “Self-Adjusting Scheduling: An On-Line Optimization Technique for Locality Management and Load Balancing,” Proc. Int'l Conf. Parallel Processing, vol. 2, pp. 39-46, 1994.
[8] J. Labarta, S. Girona, V. Pillet, T. Cortes, and L. Gregoris, “DiP: A Parallel Program Development Environment,” Proc. Second Int'l Euro-Par Conf., Aug. 1996.
[9] S.T. Leutenegger and M.K. Vernon, “The Performance of Multiprogrammed Multiprocessor Scheduling Policies,” Proc. ACM SIGMETRICS Conf., pp. 226-236, May 1990.
[10] S. Majumdar, D.L. Eager, and R.B. Bunt, “Characterisation of Programs for Scheduling in Multiprogrammed Parallel Systems,” Performance Evaluation, vol. 13, pp. 109-130, 1991.
[11] X. Martorell, “Dynamic Scheduling of Parallel Applications on Shared-Memory Multiprocessors,” PhD thesis, Technical Univ. of Catalonia (UPC), July 1999.
[12] C. McCann, R. Vaswani, and J. Zahorjan, “A Dynamic Processor Allocation Policy for Multiprogrammed Shared-Memory Multiprocessors,” ACM Trans. Computer Systems, vol. 11, no. 2, pp. 146-178, May 1993.
[13] X. Martorell, J. Labarta, N. Navarro, and E. Ayguade, “Nano-Threads Library Design, Implementation and Evaluation,” Technical Report: UPC-DAC-1995-33, Dept. d'Arquitectura de Computadors-UPC, Sept. 1995.
[14] X. Martorell, J. Labarta, N. Navarro, and E. Ayguade, “A Library Implementation of the Nano-Threads Programming Model,” Proc. Second Int'l Euro-Par Conf., vol. 2, pp. 644-649, Aug. 1996.
[15] T.D. Nguyen, J. Zahorjan, and R. Vaswani, “Parallel Application Characterization for Multiprocessor Scheduling Policy Design,” Proc. Workshop Job Scheduling Strategies for Parallel Processing, 1996.
[16] T.D. Nguyen, J. Zahorjan, and R. Vaswani, “Using Runtime Measured Workload Characteristics in Parallel Processors Scheduling,” Proc. Workshop Job Scheduling Strategies for Parallel Processing, 1996.
[17] E.W. Parsons and K.C. Sevcik, “Benefits of Speedup Knowledge in Memory-Constrained Multiprocessor Scheduling,” Performance Evaluation, nos. 27-28, pp. 253-272, 1996.
[18] K.C. Sevcik, “Application Scheduling and Processor Allocation in Multiprogrammed Parallel Processing Systems,” Performance Evaluation, vol. 19, nos. 1/3, pp. 107-140, Mar. 1994.
[19] K.C. Sevcik, “Characterization of Parallelism in Applications and Their Use in Scheduling,” Proc. ACM SIGMETRICS Conf., pp. 171-180, May 1989.
[20] A. Serra, N. Navarro, and T. Cortes, “DITools: Application-Level Support for Dynamic Extension and Flexible Composition,” Proc. USENIX Ann. Technical Conf., pp. 225-238, June 2000.
[21] Standard Performance Evaluation Corp., SPEC CPU95 Benchmarks, 1995.
[22] The Standard Workload Format, swf.html, 2004.
[23] M.J. Voss and R. Eigenmann, “Reducing Parallel Overheads through Dynamic Serialization,” Proc. 13th Int'l Parallel and Distributed Processing Symp., pp. 88-92, 1999.
[24] J.B. Weissman, L.R. Abburi, and D. England, “Integrated Scheduling: The Best of Both Worlds,” J. Parallel and Distributed Computing, vol. 63, pp. 649-668, 2003.
[25] Workload logs, 2004.

Index Terms:
Operating system algorithms, multiprocessor scheduling, runtime analysis, performance analysis, OpenMP.
Julita Corbalan, Xavier Martorell, Jesus Labarta, "Performance-Driven Processor Allocation," IEEE Transactions on Parallel and Distributed Systems, vol. 16, no. 7, pp. 599-611, July 2005, doi:10.1109/TPDS.2005.85