
Edin Hodzic and Weijia Shang, "On Supernode Transformation with Minimized Total Running Time," IEEE Transactions on Parallel and Distributed Systems, vol. 9, no. 5, pp. 417-428, May 1998.
BibTeX:
@article{10.1109/71.679213,
  author    = {Edin Hodzic and Weijia Shang},
  title     = {On Supernode Transformation with Minimized Total Running Time},
  journal   = {IEEE Transactions on Parallel and Distributed Systems},
  volume    = {9},
  number    = {5},
  issn      = {1045-9219},
  year      = {1998},
  pages     = {417--428},
  doi       = {http://doi.ieeecomputersociety.org/10.1109/71.679213},
  publisher = {IEEE Computer Society},
  address   = {Los Alamitos, CA, USA},
}
Index Terms: Supernode partitioning, tiling, parallelizing compilers, distributed memory multicomputer, minimizing running time.
Abstract—With the objective of minimizing the total execution time of a parallel program on a distributed memory parallel computer, this paper discusses how to find an optimal supernode size and optimal supernode relative side lengths of a supernode transformation (also known as tiling). We identify three parameters of supernode transformation: supernode size, relative side lengths, and cutting hyperplane directions. For algorithms with perfectly nested loops and uniform dependencies, for sufficiently large supernodes and number of processors, and for the case where multiple supernodes are mapped to a single processor, we give an order