A Highly OR-Parallel Inference Machine (Multi-ASCA) and its Performance Evaluation: An Architecture and its Load Balancing Algorithms
IEEE Transactions on Computers, September 1994 (vol. 43, no. 9)
pp. 1062-1075

An architecture and four load balancing algorithms for a highly OR-parallel inference machine are proposed, and the machine's performance is evaluated in a trace-driven simulation study. The inference machine consists of a large number of processing elements (PEs) connected directly to one another by serial I/O links in a slightly modified mesh network. Each PE is a high-speed sequential Prolog processor with its own local memory. The activity of all PEs is controlled locally by the four new load balancing algorithms, which rely on purely local communication: only directly connected PEs may communicate. These algorithms reduce the communication overhead of load balancing and make highly OR-parallel execution possible. A software simulator using a trace-driven simulation technique based on an inference tree has been developed, and typical OR-parallel benchmarks such as the n-queens problem have been simulated on it. Through the interaction of these algorithms, the average communication per load balancing operation is reduced to between 1/30 and 1/100 of that required by a conventional copying method. The inference machine (1024 PEs in a 32 × 32 array) attains a parallel speedup of 300 to 600, assuming 1 MLIPS (mega logical inferences per second) per PE and 20 Mbit/s (megabits per second) per serial I/O link, which could easily be integrated on a single chip using current VLSI technology. This highly OR-parallel inference machine promises to be an important step towards the realization of a high-performance artificial intelligence system.
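
To put the reported figures in perspective, the short Python sketch below converts the stated speedup range into parallel efficiency and aggregate inference rate. It uses only numbers from the abstract (1024 PEs, speedup of 300 to 600, 1 MLIPS per PE). The function names are illustrative and do not come from the paper, and treating aggregate throughput as speedup times the per-PE rate is a simplifying assumption.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the abstract.
# Function names are hypothetical helpers, not part of the paper.

NUM_PES = 32 * 32          # 32 x 32 array of processing elements
PE_MLIPS = 1.0             # assumed per-PE rate stated in the abstract
SPEEDUP_RANGE = (300, 600) # reported parallel speedup range


def parallel_efficiency(speedup: float, num_pes: int) -> float:
    """Fraction of ideal linear speedup that is actually achieved."""
    return speedup / num_pes


def aggregate_mlips(speedup: float, pe_mlips: float) -> float:
    """Effective machine-wide inference rate, assuming it scales as
    speedup times the single-PE rate (a simplifying assumption)."""
    return speedup * pe_mlips


if __name__ == "__main__":
    for s in SPEEDUP_RANGE:
        eff = parallel_efficiency(s, NUM_PES)
        rate = aggregate_mlips(s, PE_MLIPS)
        print(f"speedup {s}: efficiency {eff:.1%}, about {rate:.0f} MLIPS")
```

Under these assumptions the 1024-PE configuration runs at roughly 29% to 59% of ideal linear speedup, which corresponds to an effective rate of about 300 to 600 MLIPS.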


Index Terms:
inference mechanisms; parallel machines; parallel architectures; PROLOG; performance evaluation; virtual machines; resource allocation; highly OR-parallel inference machine; Multi-ASCA; performance evaluation; parallel architecture; load balancing algorithms; trace-driven simulation; processing elements; serial I/O links; modified mesh network; high-speed sequential Prolog processor; local memory; locally controlled activity; local communication; communication overhead; software simulator; inference tree; OR-parallel benchmarks; n-queens problem; copying method; VLSI; high-performance artificial intelligence system; nonshared memory multiprocessor system; 20 Mbit/s.
Citation:
J. Naganuma, T. Ogura, "A Highly OR-Parallel Inference Machine (Multi-ASCA) and its Performance Evaluation: An Architecture and its Load Balancing Algorithms," IEEE Transactions on Computers, vol. 43, no. 9, pp. 1062-1075, Sept. 1994, doi:10.1109/12.312115