This paper describes the implementation of a hardware compression and decompression unit (CDU) for reducing energy consumption in processor-based systems. Several data compression algorithms (profile-driven, adaptive, and differential) have been introduced in [1, 2]. In all cases, data compression and decompression are performed on the fly on the cache-to-memory path: uncompressed cache lines are compressed before they are written back to main memory, and decompressed when cache refills occur. This paper completes and extends the contributions of [1, 2] by demonstrating the feasibility of the proposed compression architectures, specifically addressing hardware implementation issues. The CDU design targets energy minimization in the cache-bus-memory subsystem under a strict performance constraint. Evaluated on several benchmark programs, the resulting average memory energy reduction is around 24%, with no performance penalty.
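To make the cache-to-memory data path concrete, the following is a minimal software sketch of a differential scheme: a line is compressed on write-back and restored on refill. It is not the algorithm of [1, 2] or the CDU hardware itself. The 32-bit word size, the 8-bit delta field, the fallback to an uncompressed line, and all function names are assumptions chosen for illustration.

```python
WORD_MASK = 0xFFFFFFFF  # assumption: 32-bit memory words


def _signed_delta(cur, prev):
    """Difference cur - prev, wrapped to a signed 32-bit value."""
    d = (cur - prev) & WORD_MASK
    return d - (1 << 32) if d >= (1 << 31) else d


def compress_line(words):
    """Write-back path: keep the first word in full and the rest as deltas.

    Returns ('C', base, deltas) if every delta fits in a signed byte
    (assumed field width), otherwise ('U', words) for an uncompressed line.
    """
    deltas = [_signed_delta(c, p) for p, c in zip(words, words[1:])]
    if all(-128 <= d <= 127 for d in deltas):
        return ("C", words[0], deltas)
    return ("U", list(words))


def decompress_line(packed):
    """Refill path: rebuild the original cache line from its stored form."""
    if packed[0] == "U":
        return list(packed[1])
    _, base, deltas = packed
    out = [base]
    for d in deltas:
        out.append((out[-1] + d) & WORD_MASK)
    return out


def stored_bytes(packed):
    """Bytes written to memory, ignoring the compressed/uncompressed flag."""
    if packed[0] == "U":
        return 4 * len(packed[1])
    return 4 + len(packed[2])  # one full word plus one byte per delta
```

Under these assumptions, an eight-word line holding consecutive addresses such as 0x1000, 0x1004, ..., 0x101C is stored in 11 bytes instead of 32, and `decompress_line(compress_line(line))` returns the original line. Fewer bytes moved across the bus and written to memory is the source of the energy saving the abstract refers to; the actual savings depend on the algorithm and data.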
Citation:
Luca Benini, Davide Bruni, Alberto Macii, Enrico Macii, "Hardware Implementation of Data Compression Algorithms for Memory Energy Optimization," IEEE Computer Society Annual Symposium on VLSI (ISVLSI'03), p. 250, 2003