Asynchronous Problems on SIMD Parallel Computers
July 1995 (vol. 6 no. 7)
pp. 704-713

Abstract: One of the essential problems in parallel computing is whether SIMD machines can handle asynchronous problems. This is a difficult, unsolved problem because of the mismatch between asynchronous problems and SIMD architectures. We propose a solution that lets SIMD machines handle general asynchronous problems. Our approach is to implement a runtime support system that can run MIMD-like software on SIMD hardware. The runtime support system, named the P kernel, is thread-based. The thread-based model has two major advantages. First, for application problems with irregular and/or unpredictable features, automatic scheduling can move some threads from overloaded processors to underloaded processors. Second, and more importantly, the granularity of threads can be controlled to reduce system overhead. The P kernel also handles bookkeeping and message management, and makes these low-level tasks transparent to users. Substantial performance has been obtained on the MasPar MP-1.
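The scheduling idea mentioned in the abstract, moving threads from overloaded processors to underloaded ones, can be illustrated with a minimal sketch. This is not the P kernel's actual algorithm, which the abstract does not describe; the function name `rebalance`, the per-processor queues, and the greedy one-thread-at-a-time policy are all assumptions made for illustration.

```python
from collections import deque

def rebalance(queues):
    """Greedy load balancing (hypothetical, not the P kernel's scheduler).

    Repeatedly moves one thread from the longest queue to the shortest
    until queue lengths differ by at most one. Returns the number of
    threads moved.
    """
    moves = 0
    while True:
        src = max(queues, key=len)   # most heavily loaded processor
        dst = min(queues, key=len)   # most lightly loaded processor
        if len(src) - len(dst) <= 1:
            return moves             # loads are as even as they can be
        dst.append(src.pop())        # migrate one thread
        moves += 1

# Four processors with an irregular initial distribution of thread IDs.
queues = [deque(range(6)), deque(), deque([10]), deque()]
moved = rebalance(queues)
print(moved, [len(q) for q in queues])
```

A real SIMD runtime would also have to account for the cost of moving a thread between processors, which this sketch ignores.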

Index Terms:
SIMD parallel computers, portable programming environment, load balancing, thread model, scalability, irregular and dynamic applications.
Wei Shu, Min-You Wu, "Asynchronous Problems on SIMD Parallel Computers," IEEE Transactions on Parallel and Distributed Systems, vol. 6, no. 7, pp. 704-713, July 1995, doi:10.1109/71.395399