Alex Ramirez, Josep L. Larriba-Pey, Mateo Valero, "Software Trace Cache," IEEE Transactions on Computers, vol. 54, no. 1, pp. 22-35, January 2005. ISSN 0018-9340. IEEE Computer Society, Los Alamitos, CA, USA. DOI: http://doi.ieeecomputersociety.org/10.1109/TC.2005.13
Keywords: pipeline processors, instruction fetch, compiler optimizations, branch prediction, trace cache.
[1] J.M. Anderson, L.M. Berc, J. Dean, S. Ghemawat, M.R. Henzinger, S.-T.A. Leung, R.L. Sites, M.T. Vandevoorde, C.A. Waldspurger, and W.E. Weihl, “Continuous Profiling: Where Have All the Cycles Gone?” Technical Report 1997-16, Compaq Systems Research Lab., July 1997.
[2] T. Ball and J.R. Larus, “Efficient Path Profiling,” Proc. 29th Ann. ACM/IEEE Int'l Symp. Microarchitecture, Dec. 1996.
[3] L.A. Barroso, K. Gharachorloo, and E. Bugnion, “Memory System Characterization of Commercial Workloads,” Proc. 25th Ann. Int'l Symp. Computer Architecture, pp. 3-14, June 1998.
[4] B. Calder and D. Grunwald, “Reducing Branch Costs via Branch Alignment,” Proc. Sixth Int'l Conf. Architectural Support for Programming Languages and Operating Systems, pp. 242-251, Oct. 1994.
[5] R. Cohn, D. Goodwin, P.G. Lowney, and N. Rubin, “Spike: An Optimizer for Alpha/NT Executables,” USENIX, pp. 17-23, Aug. 1997.
[6] T. Conte, K. Menezes, P. Mills, and B. Patell, “Optimization of Instruction Fetch Mechanism for High Issue Rates,” Proc. 22nd Ann. Int'l Symp. Computer Architecture, pp. 333-344, June 1995.
[7] J.A. Fisher, “Trace Scheduling: A Technique for Global Microcode Compaction,” IEEE Trans. Computers, vol. 30, no. 7, pp. 478-490, July 1981.
[8] J.A. Fisher and S.M. Freudenberger, “Predicting Conditional Branch Directions from Previous Runs of a Program,” Proc. Fifth Int'l Conf. Architectural Support for Programming Languages and Operating Systems, pp. 85-95, 1992.
[9] D.H. Friendly, S.J. Patel, and Y.N. Patt, “Alternative Fetch and Issue Techniques from the Trace Cache Mechanism,” Proc. 30th Ann. ACM/IEEE Int'l Symp. Microarchitecture, Dec. 1997.
[10] N. Gloy, T. Blackwell, M.D. Smith, and B. Calder, “Procedure Placement Using Temporal Ordering Information,” Proc. 30th Ann. ACM/IEEE Int'l Symp. Microarchitecture, pp. 303-313, Dec. 1997.
[11] A.H. Hashemi, D.R. Kaeli, and B. Calder, “Efficient Procedure Mapping Using Cache Line Coloring,” Proc. ACM SIGPLAN Conf. Programming Language Design and Implementation, pp. 171-182, June 1997.
[12] D.L. Howard and M.H. Lipasti, “The Effect of Program Optimization on Trace Cache Performance,” Proc. Int'l Conf. Parallel Architectures and Compilation Techniques, pp. 256-261, Oct. 1999.
[13] W.-M. Hwu and P.P. Chang, “Achieving High Instruction Cache Performance with an Optimizing Compiler,” Proc. 16th Ann. Int'l Symp. Computer Architecture, pp. 242-251, June 1989.
[14] J. Kalamatianos and D.R. Kaeli, “Temporal-Based Procedure Reordering for Improved Instruction Cache Performance,” Proc. Fourth Int'l Conf. High Performance Computer Architecture, Feb. 1998.
[15] C.-C. Lee, I-C.K. Chen, and T.N. Mudge, “The Bi-Mode Branch Predictor,” Proc. 30th Ann. ACM/IEEE Int'l Symp. Microarchitecture, pp. 4-13, Dec. 1997.
[16] S. McFarling, “Combining Branch Predictors,” Technical Report TN-36, Compaq Western Research Lab., June 1993.
[17] P. Michaud, A. Seznec, and R. Uhlig, “Trading Conflict and Capacity Aliasing in Conditional Branch Predictors,” Proc. 24th Ann. Int'l Symp. Computer Architecture, pp. 292-303, 1997.
[18] R. Muth, “Alto: A Platform for Object Code Modification,” PhD dissertation, Univ. of Arizona, Aug. 1999.
[19] S.J. Patel, D.H. Friendly, and Y.N. Patt, “Critical Issues Regarding the Trace Cache Fetch Mechanism,” Technical Report CSE-TR-335-97, Univ. of Michigan, May 1997.
[20] K. Pettis and R.C. Hansen, “Profile Guided Code Positioning,” Proc. ACM SIGPLAN Conf. Programming Language Design and Implementation, pp. 16-27, June 1990.
[21] A. Ramirez, L. Barroso, K. Gharachorloo, R. Cohn, J.L. Larriba-Pey, P.G. Lowney, and M. Valero, “Code Layout Optimizations for Transaction Processing Workloads,” Proc. 28th Ann. Int'l Symp. Computer Architecture, July 2001.
[22] A. Ramirez, J.L. Larriba-Pey, and M. Valero, “The Effect of Code Reordering on Branch Prediction,” Proc. Int'l Conf. Parallel Architectures and Compilation Techniques, pp. 189-198, Oct. 2000.
[23] A. Ramirez, J.L. Larriba-Pey, C. Navarro, X. Serrano, J. Torrellas, and M. Valero, “Optimization of Instruction Fetch for Decision Support Workloads,” Proc. Int'l Conf. Parallel Processing, pp. 238-245, Sept. 1999.
[24] A. Ramirez, J.L. Larriba-Pey, C. Navarro, J. Torrellas, and M. Valero, “Software Trace Cache,” Proc. 13th Int'l Conf. Supercomputing, June 1999.
[25] A. Ramirez, O.J. Santana, J.L. Larriba-Pey, and M. Valero, “Fetching Instruction Streams,” Proc. 35th Ann. ACM/IEEE Int'l Symp. Microarchitecture, 2002.
[26] M. Rosenblum, E. Bugnion, S.A. Herrod, and S. Devine, “Using the SimOS Machine Simulator to Study Complex Computer Systems,” ACM Trans. Modeling and Computer Simulation, vol. 7, no. 1, pp. 78-103, Jan. 1997.
[27] E. Rotenberg, S. Bennett, and J.E. Smith, “Trace Cache: A Low Latency Approach to High Bandwidth Instruction Fetching,” Proc. 29th Ann. ACM/IEEE Int'l Symp. Microarchitecture, pp. 24-34, Dec. 1996.
[28] A. Seznec and P. Michaud, “D-Aliased Hybrid Branch Predictors,” Technical Report PI-1229, IRISA, Feb. 1999.
[29] J.E. Smith, “A Study of Branch Prediction Strategies,” Proc. Eighth Ann. Int'l Symp. Computer Architecture, pp. 135-148, 1981.
[30] E. Sprangle, R.S. Chappell, M. Alsup, and Y.N. Patt, “The Agree Predictor: A Mechanism for Reducing Negative Branch History Interference,” Proc. 24th Ann. Int'l Symp. Computer Architecture, pp. 284-291, 1997.
[31] A. Srivastava and D.W. Wall, “A Practical System for Intermodule Code Optimization at Link-Time,” J. Programming Languages, vol. 1, no. 1, pp. 1-18, Dec. 1992.
[32] J. Torrellas, C. Xia, and R. Daigle, “Optimizing Instruction Cache Performance for Operating System Intensive Workloads,” Proc. First Int'l Conf. High Performance Computer Architecture, pp. 360-369, Jan. 1995.
[33] T.-Y. Yeh, D.T. Marr, and Y.N. Patt, “Increasing the Instruction Fetch Rate via Multiple Branch Prediction and a Branch Address Cache,” Proc. Seventh Int'l Conf. Supercomputing, pp. 67-76, July 1993.
[34] T.-Y. Yeh and Y.N. Patt, “Alternative Implementations of Two-Level Adaptive Branch Prediction,” Proc. 19th Ann. Int'l Symp. Computer Architecture, pp. 124-134, 1992.
[35] T.-Y. Yeh and Y.N. Patt, “A Comparison of Dynamic Branch Predictors that Use Two Levels of Branch History,” Proc. 20th Ann. Int'l Symp. Computer Architecture, pp. 257-266, 1993.

