Filtering Memory References to Increase Energy Efficiency
January 2000 (vol. 49 no. 1)
pp. 1-15

Abstract: Most modern microprocessors employ one or two levels of on-chip cache to improve performance. Caches are typically implemented with static RAM cells and often occupy a large portion of the chip area, so they can consume a significant amount of power. In many applications, such as portable devices, energy efficiency is more important than performance. We propose sacrificing some performance in exchange for energy efficiency by filtering cache references through an unusually small first-level cache, which we refer to as the filter cache. A second-level cache, similar in size and structure to a conventional first-level cache, is positioned behind the filter cache and serves to mitigate the performance loss. Extensive experiments indicate that a small filter cache can still achieve a high hit rate and good performance. This approach allows the second-level cache to remain in a low-power mode most of the time, which results in power savings. The filter cache is particularly attractive in low-power applications, such as the embedded processors used for communication and multimedia applications. For example, experimental results across a wide range of embedded applications show that a direct-mapped 256-byte filter cache achieves a 58 percent power reduction while reducing performance by 21 percent. This trade-off results in a 51 percent reduction in the energy-delay product when compared to a conventional design.
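
The filtering idea and the energy-delay trade-off described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' simulator. The 16-byte line size, the synthetic address trace, and the function names are assumptions made only for illustration. Treating the 58 percent power reduction as an energy ratio of 0.42 and the 21 percent performance loss as a delay ratio of 1.21 is also an assumption. The reported figures are experimental averages, so multiplying the headline numbers gives only a rough consistency check, not a reproduction of the 51 percent result.

```python
def filter_cache_hit_rate(addresses, cache_bytes=256, line_bytes=16):
    """Simulate a direct-mapped filter cache and return its hit rate.

    cache_bytes=256 matches the abstract; line_bytes=16 is an assumption.
    """
    num_lines = cache_bytes // line_bytes
    tags = [None] * num_lines            # one stored tag per cache line
    hits = 0
    for addr in addresses:
        block = addr // line_bytes       # memory block holding this byte
        index = block % num_lines        # direct mapped: one candidate line
        tag = block // num_lines
        if tags[index] == tag:
            hits += 1                    # served by the filter cache
        else:
            tags[index] = tag            # miss: refill from the second-level cache
    return hits / len(addresses)


def relative_energy_delay(energy_ratio, delay_ratio):
    """Energy-delay product relative to a baseline design (1.0 = unchanged)."""
    return energy_ratio * delay_ratio


# Hypothetical trace: a loop reading a 128-byte buffer word by word, 100 times.
trace = [addr for _ in range(100) for addr in range(0, 128, 4)]
hit_rate = filter_cache_hit_rate(trace)

# Headline numbers from the abstract, interpreted as ratios (an assumption).
edp = relative_energy_delay(energy_ratio=1 - 0.58, delay_ratio=1 + 0.21)

print(f"filter cache hit rate: {hit_rate:.4f}")
print(f"relative energy-delay product: {edp:.3f}")
```

With this trace the working set fits in the filter cache, so only the 8 cold misses out of 3200 references reach the second-level cache (a hit rate of 0.9975). The relative energy-delay product comes out near 0.51, roughly a halving, which is in the same range as the reduction the abstract reports.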

Index Terms:
Filter cache, low power, embedded processor, energy-delay, media processor.
Johnson Kin, Munish Gupta, William H. Mangione-Smith, "Filtering Memory References to Increase Energy Efficiency," IEEE Transactions on Computers, vol. 49, no. 1, pp. 1-15, Jan. 2000, doi:10.1109/12.822560