Parallel and Distributed Systems, International Conference on (1997)
Dec. 11, 1997 to Dec. 13, 1997
Traditionally, the performance of a stack machine has been limited by true data dependencies. A performance enhancement mechanism, stack operations folding, was used in Sun Microelectronics' picoJava design, where it can eliminate up to 60% of all stack operations. In this paper, we use the Java bytecode language as the target machine language and study instruction folding on a proposed machine model. Three folding strategies (2-foldable, 3-foldable, and 4-foldable) were simulated and evaluated. Statistical data show that the third strategy, 4-foldable, eliminates 73% of all stack operations, and that the three strategies achieve overall program speedups of 1.19, 1.25, and 1.26, respectively, compared with a traditional stack machine. Moreover, a Java machine model suitable for instruction folding is presented, together with its pipeline stages. The best cost/performance trade-off for a Java stack machine appears to be obtained with a six-byte decoder width and the second folding strategy, 3-foldable.
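To make the idea of folding concrete, the following is a minimal sketch of how adjacent stack instructions might be grouped into a single issued operation. It is not the paper's machine model: the instruction classes (`PRODUCERS`, `OPERATORS`, `CONSUMERS`), the pattern table `PATTERNS`, and the greedy `fold` function are all assumptions chosen for illustration, and the actual folding rules evaluated in the paper may differ.

```python
# Illustrative sketch only: the classes and patterns below are assumptions,
# not the folding rules defined in the paper.
PRODUCERS = {"iload", "iconst"}        # push a value onto the operand stack
OPERATORS = {"iadd", "isub", "imul"}   # pop two operands, push one result
CONSUMERS = {"istore"}                 # pop a value into a local variable


def kind(instr):
    """Map an instruction such as 'iload 1' to a one-letter class code."""
    opcode = instr.split()[0]
    if opcode in PRODUCERS:
        return "P"
    if opcode in OPERATORS:
        return "O"
    if opcode in CONSUMERS:
        return "C"
    return "X"  # anything else is treated as not foldable


# Hypothetical foldable patterns, listed longest first so that the
# greedy matcher prefers the largest group allowed by max_fold.
PATTERNS = ["PPOC", "PPO", "POC", "PO", "PC", "OC"]


def fold(program, max_fold):
    """Greedily group instructions; each group would issue as one operation."""
    kinds = "".join(kind(instr) for instr in program)
    groups = []
    pos = 0
    while pos < len(program):
        for pattern in PATTERNS:
            if len(pattern) <= max_fold and kinds.startswith(pattern, pos):
                groups.append(program[pos:pos + len(pattern)])
                pos += len(pattern)
                break
        else:
            # No pattern matched here: issue the instruction on its own.
            groups.append([program[pos]])
            pos += 1
    return groups


if __name__ == "__main__":
    # Bytecode for c = a + b, with a, b, c in local variables 1, 2, 3.
    program = ["iload 1", "iload 2", "iadd", "istore 3"]
    for width in (1, 2, 3, 4):
        print(width, fold(program, width))
```

Under these assumed patterns, the four-instruction sequence for `c = a + b` is issued as four separate operations without folding, three groups with a maximum group size of 2, two groups with a maximum of 3, and a single group with a maximum of 4. A greedy left-to-right matcher is used only for simplicity; it says nothing about how the decoder in the proposed machine actually selects groups.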
S. Shang et al., "Instruction Folding in Java Processor," Parallel and Distributed Systems, International Conference on (ICPADS), Seoul, Korea, 1997, pp. 138.