34th Annual Simulation Symposium (SS01)
PPIM-SIM: An Efficient Simulator for a Parallel Processor in Memory
Seattle, WA
April 22-April 26
ISBN: 0-7695-1092-2
Abstract: The gap between the speed of logic and DRAM access is widening. Traditional processors hide some of the mismatch in latency using techniques such as multi-level caches, instruction prefetching and memory interleaving/pipelining. Even with larger caches, cache miss rates are higher than the rate at which memory can provide data. Moreover, the memory bandwidth visible at the system bus forms a bottleneck. Therefore, there are compelling reasons for integrating DRAM and logic, including: (i) the bandwidth available within the chip is many orders of magnitude higher than that at the memory bus, with significantly lower access time and lower power dissipation; and (ii) as typical workloads shift towards data-intensive/multimedia applications, the wide bandwidth can be effectively utilized. To effectively support data-intensive applications, we designed a Parallel Processor in Memory (PPIM). PPIM is based on a distributed data-parallel architecture with limited support for control parallelism. The paper presents ppim-sim, a cycle-accurate simulator that models the PPIM processor in software and is capable of running PPIM program binaries. Experiments conducted to evaluate the simulator, using a number of data-intensive application models for varying PPIM configurations, are presented. The experiments show that ppim-sim not only simulates large models in tractable amounts of time but is also memory efficient. In addition, the parameterized design of ppim-sim, coupled with robust and effective interfaces, makes it a research tool for studying different processing element and controller architectures implemented in memory.
Citation:
Krishna Kumar Rangan, Philip A. Wilsey, Nilesh Pisolkar, Nael B. Abu-Ghazaleh, "PPIM-SIM: An Efficient Simulator for a Parallel Processor in Memory," ss, p. 117, 34th Annual Simulation Symposium (SS01), 2001