Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (1998)
Oct. 12, 1998 to Oct. 18, 1998
Wayne Wolf, Princeton University
Zhao Wu, Princeton University
Programmable Video Signal Processors (VSPs) play an important role in multimedia applications due to their high performance and flexibility. To exploit the large amount of parallelism inherent in these applications, VSPs employ aggressive parallel architectures, among which Very Long Instruction Word (VLIW) is becoming increasingly popular. For video signal processing, a carefully designed memory system is of particular importance, because video-rate applications place an unusually heavy demand on high-bandwidth, low-latency memory access. Although many papers have addressed this issue for systems built from general-purpose microprocessors, few have considered environments containing VSPs, especially VLIW-based VSPs. In this paper, we outline the problems involved in this specific context and compare five memory architectures for shared-memory-based VSPs. Our simulation results for six video applications show that combining caches with stream buffers in the proper way provides the highest performance.
VLIW, VSP, multi-cluster, memory system, shared memory, cache, stream buffer, stride prediction table, trace-driven simulation
Wayne Wolf, Zhao Wu, "Design Study of Shared Memory in VLIW Video Signal Processors", Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, p. 52, 1998, doi:10.1109/PACT.1998.727148