<p><b>Abstract</b>: We illustrate the potential of techniques and results from the theory of network emulations to enhance the performance of a parallel architecture. The vehicle for this demonstration is a suite of algorithms that endow an <tmath>$N$</tmath>-processor bit-serial processor array <tmath>${\cal A}$</tmath> with a “meta-instruction” <ss>GAUGE</ss> <tmath>$k$</tmath>, which (logically) reconfigures <tmath>${\cal A}$</tmath> into an <tmath>$N/k$</tmath>-processor virtual machine <tmath>${\cal B}_k$</tmath> that has: 1) a datapath and memory bus whose emulated width is <tmath>$k$</tmath> bits, as opposed to <tmath>${\cal A}$</tmath>'s 1-bit width, and 2) an instruction set that operates on <tmath>$k$</tmath>-bit words, in contrast to <tmath>${\cal A}$</tmath>'s instruction set, which operates on 1-bit words. To stress the strength of the approach, we show (via pseudocode) how our emulation techniques can be implemented efficiently even if <tmath>${\cal A}$</tmath> operates in strict SIMD mode, with only single-bit masking capabilities and with no indexed memory accesses. We describe at an algorithmic level how to implement our technique, including datapath conversion (“corner-turning”) and the creation of the word-parallel instruction sets, on arrays of any regular network topology. We instantiate our technique in detail for arrays based on topologies with quite disparate characteristics: the hypercube, the de Bruijn network, and a genre of mesh with reconfigurable buses. Importantly, the emulations that underlie our technique do not alter the native machine's instruction set, hence allowing an invariant programming model across gauges.</p>
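As a rough illustration of the datapath conversion ("corner-turning") mentioned in the abstract, the following minimal sketch transposes bits within each group of k processors, so that a word stored bit-serially in one processor ends up spread across the k processors of its group. This is a sequential model under assumed conventions, not the paper's SIMD algorithm: the function name `corner_turn`, the layout `mem[p][a]` (bit at address `a` of processor `p`), and the grouping of consecutive processors are all assumptions made for illustration.

```python
def corner_turn(mem, k):
    """Transpose each k-by-k block of bits (sequential sketch, hypothetical layout).

    mem[p][a] is the bit stored at address a of processor p.
    Before: processor p holds one k-bit word serially at addresses 0..k-1.
    After:  address i of a group holds one whole word, with bit j on the
            j-th processor of that group.
    """
    n = len(mem)
    if n % k != 0:
        raise ValueError("number of processors must be a multiple of k")
    out = [[0] * k for _ in range(n)]
    for base in range(0, n, k):          # one group of k processors at a time
        for i in range(k):               # i: source processor within the group
            for j in range(k):           # j: bit position within the word
                out[base + j][i] = mem[base + i][j]
    return out


# Four 1-bit processors regrouped as two 2-bit virtual processors (k = 2).
bit_serial = [[1, 0], [1, 1], [0, 0], [0, 1]]
word_parallel = corner_turn(bit_serial, 2)
```

Applying the function twice with the same k restores the original layout, which is a convenient sanity check. On a real array the same permutation would have to be carried out through the interconnection network rather than by indexed copying.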
Parallel architecture, multiprocessor interconnection, parallel algorithms.
Arnold L. Rosenberg, Martin C. Herbordt, Charles C. Weems, Bojana Obrenic, "Using Emulations to Enhance the Performance of Parallel Architectures", IEEE Transactions on Parallel & Distributed Systems, vol. 10, no. , pp. 1067-1081, October 1999, doi:10.1109/71.808155