<p>An architecture for a highly OR-parallel inference machine, together with four load balancing algorithms for it, is proposed, and its performance is evaluated in a trace-driven simulation study. The inference machine consists of a large number of processing elements (PEs) with serial I/O links, directly connected to each other in a simply modified mesh network. Each PE is a high-speed sequential Prolog processor with its own local memory. The activity of all PEs is controlled locally by the four new load balancing algorithms, which rely on purely local communication: communication is allowed only between directly connected PEs. These algorithms reduce the communication overhead of each load balancing operation and make highly OR-parallel execution possible. A software simulator using a trace-driven simulation technique based on an inference tree has been developed, and some typical OR-parallel benchmarks, such as the n-queens problem, have been simulated on it. Through the interaction of these load balancing algorithms, the average communication per load balancing operation is reduced to between 1/30 and 1/100 of that of a conventional copying method. The inference machine (1024 PEs in a 32 × 32 array) attains a parallel speedup of 300 to 600 times, assuming a 1 MLIPS (mega logical inferences per second) PE and 20 MBPS (megabits per second) for each serial I/O link, both of which could easily be integrated on a single chip using current VLSI technology. This highly OR-parallel inference machine promises to be an important step towards the realization of a high-performance artificial intelligence system.</p>
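The speedup figures in the abstract can be put in perspective with a little arithmetic. The sketch below is an illustration only, not part of the paper: it takes the 1024-PE count, the assumed 1 MLIPS per-PE rate, and the 300 to 600 times speedup range from the abstract, and derives the implied parallel efficiency and aggregate inference rate. The function and constant names are hypothetical.

```python
# Figures quoted in the abstract; everything derived from them below is
# plain arithmetic, not a result reported by the authors.
NUM_PES = 32 * 32            # 1024 processing elements in a 32 x 32 array
PE_RATE_MLIPS = 1.0          # per-PE rate assumed in the abstract
SPEEDUP_RANGE = (300, 600)   # reported parallel speedup range


def parallel_efficiency(speedup: float, num_pes: int) -> float:
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / num_pes


def aggregate_mlips(speedup: float, pe_rate_mlips: float) -> float:
    """Aggregate inference rate implied by a speedup over a single PE."""
    return speedup * pe_rate_mlips


if __name__ == "__main__":
    for s in SPEEDUP_RANGE:
        eff = parallel_efficiency(s, NUM_PES)
        rate = aggregate_mlips(s, PE_RATE_MLIPS)
        print(f"speedup {s}: efficiency {eff:.1%}, aggregate {rate:.0f} MLIPS")
```

Under these assumptions, the reported range corresponds to roughly 29% to 59% of ideal linear speedup on 1024 PEs.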
inference mechanisms; parallel machines; parallel architectures; PROLOG; performance evaluation; virtual machines; resource allocation; highly OR-parallel inference machine; Multi-ASCA; parallel architecture; load balancing algorithms; trace-driven simulation; processing elements; serial I/O links; modified mesh network; high-speed sequential Prolog processor; local memory; locally controlled activity; local communication; communication overhead; software simulator; inference tree; OR-parallel benchmarks; n-queens problem; copying method; VLSI; high-performance artificial intelligence system; nonshared memory multiprocessor system; 20 Mbit/s.

T. Ogura and J. Naganuma, "A Highly OR-Parallel Inference Machine (Multi-ASCA) and its Performance Evaluation: An Architecture and its Load Balancing Algorithms," in IEEE Transactions on Computers, vol. 43, no. , pp. 1062-1075, 1994.