In recent years, multimedia and game applications have grown explosively in both number and complexity. Because these applications typically demand 10^10 to 10^11 operations per second, they require higher processing capability. Stream processors are therefore becoming popular because of their performance advantages in domains such as signal processing and multimedia. To provide sufficient computing capability, stream processors employ multiple SIMD units. Moreover, to overcome the constraints of a centralized register file, a hierarchical register organization has been proposed and is widely used in stream processors. In the upper level of the hierarchy, the distributed register file (DRF) has become the dominant design, with explicit interconnections among the DRFs that are managed by the compiler in a VLIW manner. To further exploit the locality characteristics of multimedia applications, the lower level is a multi-banked register file in which each bank is accessed by several SIMD units through a shared data bus. We refer to an architecture with these characteristics as the MLRMSIMD architecture.
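As a rough illustration of why a shared data bus in the lower level can limit bandwidth, the following toy model counts the cycles needed to serve a set of simultaneous bank accesses. It is only a sketch under stated assumptions, not the paper's model: the function name `cycles_to_serve`, the static mapping of SIMD units to banks, and the default of one access per bank per cycle are all hypothetical choices made for illustration.

```python
from collections import Counter


def cycles_to_serve(requests, units_per_bank, accesses_per_cycle=1):
    """Estimate cycles needed to serve one access per requesting SIMD unit.

    Assumptions (illustrative only, not taken from the paper):
      - SIMD unit u is statically attached to bank u // units_per_bank.
      - Each bank's shared bus carries `accesses_per_cycle` accesses per cycle.
      - Banks operate in parallel, so the busiest bank sets the total.
    """
    # Count how many requesting units share each bank's bus.
    per_bank = Counter(unit // units_per_bank for unit in requests)
    if not per_bank:
        return 0
    # Ceiling division: accesses on the same bus are serialized.
    return max(-(-n // accesses_per_cycle) for n in per_bank.values())


if __name__ == "__main__":
    # Four units on the same bank contend for one bus.
    print(cycles_to_serve([0, 1, 2, 3], units_per_bank=4))
    # Four units on four different banks proceed in parallel.
    print(cycles_to_serve([0, 4, 8, 12], units_per_bank=4))
```

In this toy model, requests that fall on the same bank are serialized on its bus, while requests spread across banks complete together, which is the kind of bandwidth constraint the shared-bus organization introduces.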
Citation:
Weihua Zhang, Tao Bao, Binyu Zang, Chuanqi Zhu, "Optimizing Bandwidth Constraint through Register Interconnection for Stream Processors," 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007), p. 434, 2007.