Per-Thread Cycle Accounting
January/February 2010 (vol. 30 no. 1)
pp. 71-80
Stijn Eyerman, Ghent University
Lieven Eeckhout, Ghent University

Resource sharing affects per-thread performance in multithreaded architectures in unpredictable ways, yet system software assumes that all coexecuting threads make equal progress. Per-thread cycle accounting addresses this problem by tracking the progress rate of each coexecuting thread. This approach has the potential to improve quality of service (QoS), support for service-level agreements (SLAs), performance predictability, service differentiation, and proportional-share performance on multithreaded architectures.

Index Terms:
multicore, multithreaded architectures, system software, per-thread cycle accounting
Citation:
Stijn Eyerman, Lieven Eeckhout, "Per-Thread Cycle Accounting," IEEE Micro, vol. 30, no. 1, pp. 71-80, Jan.-Feb. 2010, doi:10.1109/MM.2010.23