2006 International Conference on Parallel Architectures and Compilation Techniques (PACT) (2006)
Seattle, WA, USA
Sept. 16, 2006 to Sept. 20, 2006
DOI Bookmark: http://doi.ieeecomputersociety.org/
Oliverio J. Santana , Universidad de Las Palmas de Gran Canaria
Ayose Falcon , Barcelona Research Office, HP Labs
Alex Ramirez , Universitat Politècnica de Catalunya and Barcelona Supercomputing Center
Mateo Valero , Universitat Politècnica de Catalunya and Barcelona Supercomputing Center
Fast instruction decoding is a challenge in the design of CISC microprocessors. A well-known solution is the trace cache, which stores and fetches already decoded instructions so that they need not be decoded again. However, implementing a trace cache significantly increases the complexity of the fetch architecture. In this paper, we propose a novel decoding architecture that reduces the implementation cost of the fetch engine. Instead of using a special-purpose buffer like the trace cache, our proposal stores frequently decoded instructions in the memory hierarchy. The address where the decoded instructions are stored is kept in the branch prediction mechanism, enabling it to guide our decoding architecture. This makes it possible for the processor front-end to fetch already decoded instructions from memory instead of the original non-decoded instructions. Our results show that an 8-wide superscalar processor achieves an average 14% performance improvement by using our decoding architecture. This improvement is comparable to the one achieved with the more complex trace cache, while requiring 16% less chip area and 21% less energy consumption in the fetch architecture.
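The mechanism summarized in the abstract can be illustrated with a toy model. The sketch below is not the authors' design: the class names, the decode-count threshold used to decide when a block is "frequently decoded", the base address for decoded storage, and the string-based "decoder" are all assumptions made for illustration. It only shows the general idea that a branch predictor entry can carry the address of an already decoded copy, letting the front-end skip the decoder.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class PredictorEntry:
    """Hypothetical branch predictor entry for one fetch block."""
    decoded_addr: Optional[int] = None  # address of the decoded copy, if any
    decode_count: int = 0               # times this block went through the decoder


class FrontEnd:
    """Toy front-end: decoded blocks live in ordinary memory, and the
    predictor entry remembers where they are (all policies assumed)."""

    def __init__(self, code: Dict[int, List[str]], threshold: int = 2,
                 decoded_base: int = 0x8000_0000) -> None:
        self.code = code                          # pc -> raw instructions
        self.threshold = threshold                # assumed "frequently decoded" cutoff
        self.memory: Dict[int, List[str]] = {}    # decoded_addr -> decoded ops
        self.predictor: Dict[int, PredictorEntry] = {}
        self.next_free = decoded_base             # assumed bump allocator
        self.decoder_uses = 0

    def decode(self, raw: List[str]) -> List[str]:
        """Stand-in for the (expensive) CISC decoder."""
        self.decoder_uses += 1
        return [f"uop({ins})" for ins in raw]

    def fetch(self, pc: int) -> Tuple[List[str], bool]:
        """Return (decoded ops, True if the decoder was bypassed)."""
        entry = self.predictor.setdefault(pc, PredictorEntry())
        if entry.decoded_addr is not None:
            # The predictor points at a decoded copy: fetch it from memory.
            return self.memory[entry.decoded_addr], True
        ops = self.decode(self.code[pc])
        entry.decode_count += 1
        if entry.decode_count >= self.threshold:
            # Block is decoded often enough: store it and record its address.
            entry.decoded_addr = self.next_free
            self.memory[entry.decoded_addr] = ops
            self.next_free += len(ops)
        return ops, False
```

With the assumed threshold of 2, the first two fetches of a block go through the decoder, and later fetches of the same block are served from the stored decoded copy.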
Complexity-effective, instruction decoding, branch predictor
O. J. Santana, A. Falcon, A. Ramirez and M. Valero, "Branch predictor guided instruction decoding," 2006 International Conference on Parallel Architectures and Compilation Techniques (PACT), Seattle, WA, USA, 2006, pp. 202-211.