E-Kernel: An Embedding Kernel on the IBM Victor V256 Multiprocessor for Program Mapping and Network Reconfiguration
September 1994 (vol. 5 no. 9)
pp. 977-994

We present the design of E-kernel, an embedding kernel on the Victor V256 message-passing partitionable multiprocessor, developed to support program mapping and network reconfiguration. E-kernel supports the embedding of a new network topology onto Victor's 2D mesh, and also the embedding of a task graph onto the 2D mesh network or the reconfigured network. In the current implementation, the reconfigured network can be a line or an even-size ring, and the task graphs can be meshes or tori of a variety of dimensions and shapes, or graphs with similar topologies. Application programs that have these task graph topologies and are designed according to the communication model of E-kernel can run without any change on partitions connected by the 2D mesh, line, or ring. Further, E-kernel automatically attempts to optimize the communication of these programs on the different networks, thus making both the network topology and the communication optimization attempt completely transparent to the application programs. Many of the embeddings used in E-kernel are optimal or asymptotically optimal (with respect to minimum dilation cost). The implementation of E-kernel translated some of the many theoretical results in graph embeddings into practical tools for program mapping and network reconfiguration in a parallel system. E-kernel is functional on Victor V256. Measurements of E-kernel's performance on V256 are also included.
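
To make the notion of dilation cost concrete, the following is a minimal sketch of how a line and an even-size ring might be laid out on a 2D mesh so that every guest edge maps to a single mesh link (dilation 1). It is not E-kernel's actual algorithm: the function names `snake_line`, `ring_on_mesh`, and `dilation` are hypothetical, and the construction shown is a textbook boustrophedon ordering plus a Hamiltonian cycle that assumes an even number of rows.

```python
def snake_line(rows, cols):
    """Place line node i at order[i], sweeping rows in alternating direction."""
    order = []
    for r in range(rows):
        cs = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        order.extend((r, c) for c in cs)
    return order


def ring_on_mesh(rows, cols):
    """Place ring node i at order[i] along a Hamiltonian cycle of the mesh.

    Illustrative assumption: rows is even and both dimensions are >= 2.
    Column 0 is kept free as the return path that closes the cycle.
    """
    if rows < 2 or cols < 2 or rows % 2 != 0:
        raise ValueError("sketch assumes an even number of rows, rows, cols >= 2")
    order = [(0, c) for c in range(cols)]            # row 0, left to right
    for r in range(1, rows):                         # snake over columns 1..cols-1
        cs = range(cols - 1, 0, -1) if r % 2 == 1 else range(1, cols)
        order.extend((r, c) for c in cs)
    order.extend((r, 0) for r in range(rows - 1, 0, -1))  # back up column 0
    return order


def dilation(order, ring=False):
    """Largest mesh (Manhattan) distance between images of adjacent guest nodes."""
    pairs = list(zip(order, order[1:]))
    if ring:
        pairs.append((order[-1], order[0]))          # wrap-around edge of the ring
    return max((abs(a[0] - b[0]) + abs(a[1] - b[1]) for a, b in pairs), default=0)


if __name__ == "__main__":
    # 16 x 16 is used here purely as an illustrative partition size.
    print(dilation(snake_line(16, 16)))
    print(dilation(ring_on_mesh(16, 16), ring=True))
```

Both layouts visit every mesh node exactly once, so the guest network uses the whole partition. The ring construction also shows why an even node count matters: a 2D mesh is bipartite, so any cycle through all of its nodes must have even length.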


Index Terms:
multiprocessing systems; message passing; real-time systems; graph theory; program diagnostics; parallel programming; E-kernel; embedding kernel; IBM Victor V256 multiprocessor; program mapping; network reconfiguration; message-passing partitionable multiprocessor; new network topology; task graph; 2D mesh network; reconfigured network; task graph topologies; communication model; communication network topology; asymptotically optimal; parallel system
E. Ma, D.G. Shea, "E-Kernel: An Embedding Kernel on the IBM Victor V256 Multiprocessor for Program Mapping and Network Reconfiguration," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 9, pp. 977-994, Sept. 1994, doi:10.1109/71.308535