Issue No. 06 - June (2010 vol. 59)
pp. 855-864
Yiqiang Ding, Southern Illinois University Carbondale, Carbondale
Wei Zhang, Southern Illinois University Carbondale, Carbondale
ABSTRACT
Estimating and optimizing worst-case execution time (WCET) is critical for hard real-time systems to ensure that different tasks can meet their respective deadlines. Recent work has shown that simple prefetching techniques such as Next-N-Line prefetching can enhance both average-case and worst-case performance; however, the improvement in worst-case execution time is rather limited and inefficient. This paper studies a loop-based instruction prefetching approach, which exploits program control-flow information to intelligently prefetch the instructions that are most likely to be needed. Our evaluation indicates that loop-based instruction prefetching outperforms Next-N-Line prefetching in both worst-case and average-case performance for real-time applications.
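As a rough illustration of the Next-N-Line baseline mentioned in the abstract, the following toy sketch counts demand misses in a direct-mapped instruction cache when every access also prefetches the next N sequential lines. It is not the authors' model or their loop-based scheme: the function name `count_misses`, the always-prefetch policy, the instantaneous (zero-latency) prefetch, and the traces of cache-line numbers are all assumptions made for the example.

```python
def count_misses(trace, num_sets, n):
    """Count demand misses for a toy direct-mapped instruction cache.

    trace    -- sequence of cache-line numbers fetched by the program
    num_sets -- number of lines in the cache (one line per set)
    n        -- lines prefetched after each access (0 disables prefetching)

    Assumption: prefetched lines arrive instantly, which real hardware
    does not guarantee; this only shows the replacement behaviour.
    """
    cache = [None] * num_sets
    misses = 0
    for line in trace:
        index = line % num_sets
        if cache[index] != line:
            misses += 1          # demand miss: fetch the line itself
            cache[index] = line
        for k in range(1, n + 1):
            nxt = line + k       # Next-N-Line: bring in the following lines
            cache[nxt % num_sets] = nxt
    return misses


if __name__ == "__main__":
    straight = list(range(8))    # straight-line code, 8 cache lines
    loop = [0, 1, 2, 3] * 3      # a 4-line loop body executed 3 times
    for name, trace in (("straight", straight), ("loop", loop)):
        for n in (0, 1):
            print(name, "N =", n, "misses =", count_misses(trace, 4, n))
```

In this toy setup, sequential prefetching removes almost all misses on the straight-line trace, but on the loop trace the prefetch issued past the end of the loop body evicts the loop head on every iteration, so the gain is small. That behaviour is specific to the chosen cache size and trace, and is only meant to show why control-flow information can matter to a prefetcher.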
INDEX TERMS
Real-time and embedded systems, cache memories.
CITATION
Yiqiang Ding, Wei Zhang, "Loop-Based Instruction Prefetching to Reduce the Worst-Case Execution Time", IEEE Transactions on Computers, vol. 59, no. 6, pp. 855-864, June 2010, doi:10.1109/TC.2010.44