
E. Ma, D.G. Shea, "EKernel: An Embedding Kernel on the IBM Victor V256 Multiprocessor for Program Mapping and Network Reconfiguration," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 9, pp. 977-994, September 1994.
DOI: 10.1109/71.308535
ISSN: 1045-9219
Publisher: IEEE Computer Society, Los Alamitos, CA, USA
Index Terms: multiprocessing systems; message passing; real-time systems; graph theory; program diagnostics; parallel programming; Ekernel; embedding kernel; IBM Victor V256 multiprocessor; program mapping; network reconfiguration; message-passing partitionable multiprocessor; new network topology; task graph; 2D mesh network; reconfigured network; task graph topologies; communication model; communication network topology; asymptotically optimal; parallel system
We present the design of Ekernel, an embedding kernel on the Victor V256 message-passing partitionable multiprocessor, developed to support program mapping and network reconfiguration. Ekernel supports the embedding of a new network topology onto Victor's 2D mesh, and also the embedding of a task graph onto the 2D mesh network or the reconfigured network. In the current implementation, the reconfigured network can be a line or an even-size ring, and the task graphs can be meshes or tori of a variety of dimensions and shapes, or graphs with similar topologies. Application programs that have these task graph topologies and are designed according to the communication model of Ekernel can run without any change on partitions connected by the 2D mesh, line, or ring. Further, Ekernel automatically attempts to optimize the communication of these programs on the different networks, making both the network topology and the communication optimization attempt completely transparent to the application programs. Many of the embeddings used in Ekernel are optimal or asymptotically optimal (with respect to minimum dilation cost). The implementation of Ekernel translated some of the many theoretical results in graph embeddings into practical tools for program mapping and network reconfiguration in a parallel system. Ekernel is functional on Victor V256. Measurements of Ekernel's performance on V256 are also included.
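As an illustration of what embedding a line or an even-size ring onto a 2D mesh with minimum dilation can look like, the following is a minimal sketch of two textbook constructions. It is not taken from Ekernel: the function names `line_in_mesh`, `ring_in_mesh`, and `dilation` are hypothetical, and dilation is assumed here to mean the largest mesh (Manhattan) distance between the images of two neighboring nodes. A mesh is a bipartite graph, so a ring can only be laid out with dilation 1 when it has an even number of nodes, which is consistent with the even-size ring mentioned in the abstract.

```python
from typing import List, Tuple

Node = Tuple[int, int]  # (row, col) position in the 2D mesh


def line_in_mesh(rows: int, cols: int) -> List[Node]:
    """Snake (boustrophedon) order: consecutive line nodes are mesh neighbors."""
    order: List[Node] = []
    for r in range(rows):
        # Even rows run left to right, odd rows run right to left.
        cs = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        order.extend((r, c) for c in cs)
    return order


def ring_in_mesh(rows: int, cols: int) -> List[Node]:
    """Closed tour of the mesh; this sketch needs an even number of rows.

    Columns 1..cols-1 are covered by a snake, and column 0 is kept free
    as the return path that closes the ring.
    """
    if rows < 2 or rows % 2 != 0 or cols < 2:
        raise ValueError("sketch requires an even number of rows and cols >= 2")
    order: List[Node] = []
    for r in range(rows):
        cs = range(1, cols) if r % 2 == 0 else range(cols - 1, 0, -1)
        order.extend((r, c) for c in cs)
    # Walk back up column 0 so the last node is adjacent to the first.
    order.extend((r, 0) for r in range(rows - 1, -1, -1))
    return order


def dilation(order: List[Node], closed: bool) -> int:
    """Largest mesh distance between images of neighboring line/ring nodes."""
    pairs = list(zip(order, order[1:]))
    if closed and len(order) > 1:
        pairs.append((order[-1], order[0]))
    return max((abs(a[0] - b[0]) + abs(a[1] - b[1]) for a, b in pairs), default=0)
```

For example, `dilation(line_in_mesh(3, 4), closed=False)` and `dilation(ring_in_mesh(4, 3), closed=True)` both evaluate to 1, meaning every logical link maps onto a single physical mesh link in these two layouts.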