International Symposium on Code Generation and Optimization (CGO'04) Using Dynamic Binary Translation to Fuse Dependent Instructions San Jose, California March 20-March 24 ISBN: 0-7695-2102-9
Instruction scheduling hardware can be simplified and easily pipelined if pairs of dependent instructions are fused so they share a single instruction scheduling slot. We study an implementation of the x86 ISA that dynamically translates x86 code to an underlying ISA that supports instruction fusing. A microarchitecture that is co-designed with the fused instruction set completes the implementation.In this paper, we focus on the dynamic binary translator for such a co-designed x86 virtual machine. The dynamic binary translator first cracks x86 instructions belonging to hot superblocks into RISC-style micro-operations, and then uses heuristics to fuse together pairs of dependent micro-operations. Experimental results with SPEC2000 integer benchmarks demonstrate that: (1) the fused ISA with dynamic binary translation reduces the number of scheduling decisions by about 30% versus a conventional implementation that uses hardware cracking into RISC micro-operations; (2) an instruction scheduling slot needs only hold two source register fields even though it may hold two instructions; (3) translations generated in the proposed ISA consume about 30% less storage than a corresponding fixed-length RISC-style ISA.
Citation:
Shiliang Hu, James E. Smith, "Using Dynamic Binary Translation to Fuse Dependent Instructions," cgo, pp.213, International Symposium on Code Generation and Optimization (CGO'04), 2004 Usage of this product signifies your acceptance of the Terms of Use. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||