Active Management of Data Caches by Exploiting Reuse Information
November 1999 (vol. 48 no. 11)
pp. 1244-1259

Abstract: As microprocessor speeds continue to outpace memory subsystem speeds, minimizing average data access time grows in importance. Multilateral caches afford an opportunity to reduce the average data access time through active management of block allocation and replacement decisions. We evaluate and compare the performance of traditional caches and multilateral caches with three active block allocation schemes: MAT, NTS, and PCS. We also compare the performance of NTS and PCS to multilateral caches with a near-optimal but nonimplementable policy, pseudo-opt, that employs future knowledge to achieve both active allocation and active replacement. NTS and PCS are evaluated relative to pseudo-opt with respect to miss ratio, accuracy of predicting reference locality, actual usage accuracy, and tour lengths of blocks in the cache. Results show that the multilateral schemes outperform traditional cache management schemes but fall short of pseudo-opt; increasing their prediction accuracy and incorporating active replacement decisions would allow them to approach pseudo-opt performance more closely.
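The abstract does not specify how pseudo-opt works, so the following is only an illustrative sketch of why future knowledge helps a replacement policy. It is not the paper's pseudo-opt and does not model a multilateral cache. It compares LRU with a farthest-next-use policy (in the spirit of Belady's MIN) on a single fully associative cache. The function names, the block-ID trace, and the capacity are all assumptions made for this example.

```python
from collections import OrderedDict


def lru_misses(trace, capacity):
    """Count misses for a fully associative cache with LRU replacement."""
    cache = OrderedDict()
    misses = 0
    for block in trace:
        if block in cache:
            cache.move_to_end(block)       # mark as most recently used
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = True
    return misses


def farthest_next_use_misses(trace, capacity):
    """Count misses when the victim is the block whose next use is farthest away.

    This needs the whole trace in advance, which is why such a policy
    cannot be built in hardware.
    """
    # next_use[i] = index of the next reference to trace[i], or infinity.
    next_use = [float("inf")] * len(trace)
    last_seen = {}
    for i in range(len(trace) - 1, -1, -1):
        next_use[i] = last_seen.get(trace[i], float("inf"))
        last_seen[trace[i]] = i

    cache = {}  # block -> index of its next reference
    misses = 0
    for i, block in enumerate(trace):
        if block not in cache:
            misses += 1
            if len(cache) >= capacity:
                victim = max(cache, key=cache.get)
                del cache[victim]
        cache[block] = next_use[i]
    return misses


# Hypothetical trace of block IDs, chosen only for illustration.
trace = [1, 2, 3, 1, 2, 4, 1, 2, 3, 4]
print("LRU miss ratio:", lru_misses(trace, 3) / len(trace))
print("Future-knowledge miss ratio:", farthest_next_use_misses(trace, 3) / len(trace))
```

On this toy trace the future-knowledge policy incurs one fewer miss than LRU. The gap between implementable schemes and a future-knowledge bound is the kind of comparison the abstract describes for NTS and PCS against pseudo-opt.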

Index Terms:
Multilateral cache, active management, reuse information.
Edward S. Tam, Jude A. Rivers, Vijayalakshmi Srinivasan, Gary S. Tyson, Edward S. Davidson, "Active Management of Data Caches by Exploiting Reuse Information," IEEE Transactions on Computers, vol. 48, no. 11, pp. 1244-1259, Nov. 1999, doi:10.1109/12.811113