W. Shang, M.T. O'Keefe, J.A.B. Fortes, "On Loop Transformations for Generalized Cycle Shrinking," IEEE Transactions on Parallel and Distributed Systems, vol. 5, no. 2, pp. 193-204, February 1994.
DOI: http://doi.ieeecomputersociety.org/10.1109/71.265946
ISSN: 1045-9219
Publisher: IEEE Computer Society, Los Alamitos, CA, USA
Index Terms: scheduling; program compilers; loop transformations; generalized cycle shrinking; parallelism; nested loop structures; selective cycle shrinking; linear scheduling; conflict-free mappings
This paper describes several loop transformation techniques for extracting parallelism from nested loop structures. The transformed loops can then be scheduled to run in parallel so that execution time is minimized. One technique is called selective cycle shrinking, and the other is called true dependence cycle shrinking. It is shown how selective shrinking is related to linear scheduling of nested loops and how true dependence shrinking is related to conflict-free mappings of higher dimensional algorithms into lower dimensional processor arrays. Methods are proposed in this paper to find the selective and true dependence shrinkings with minimum total execution time by applying the techniques for finding optimal linear schedules and optimal conflict-free mappings proposed by W. Shang and J.A.B. Fortes.
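The abstract ties selective shrinking to linear scheduling of nested loops. As a rough illustration of that general idea only (not the paper's algorithm), the sketch below assumes a rectangular index set and uniform dependence vectors, assigns iteration j to time step pi . j, and picks the schedule vector pi by brute force. The function names, the search range `max_coeff`, and the simplified length measure are all assumptions made for this example.

```python
from itertools import product


def dot(u, v):
    """Inner product of two equal-length integer tuples."""
    return sum(a * b for a, b in zip(u, v))


def is_valid_schedule(pi, deps):
    """pi respects the dependences if pi . d >= 1 for every dependence vector d."""
    return all(dot(pi, d) >= 1 for d in deps)


def schedule_length(pi, bounds):
    """Simplified length measure: span of pi . j over the box 0 <= j_k < n_k, plus 1."""
    return sum(abs(p) * (n - 1) for p, n in zip(pi, bounds)) + 1


def best_linear_schedule(deps, bounds, max_coeff=3):
    """Exhaustive search over small integer vectors (assumed range, illustration only)."""
    best = None
    for pi in product(range(-max_coeff, max_coeff + 1), repeat=len(bounds)):
        if is_valid_schedule(pi, deps):
            length = schedule_length(pi, bounds)
            if best is None or length < best[1]:
                best = (pi, length)
    return best


def wavefronts(pi, bounds):
    """Group iterations by time step pi . j; iterations in one group are independent."""
    fronts = {}
    for j in product(*(range(n) for n in bounds)):
        fronts.setdefault(dot(pi, j), []).append(j)
    return [fronts[t] for t in sorted(fronts)]


if __name__ == "__main__":
    # Hypothetical 4 x 4 loop nest where iteration (i, j) depends on (i-1, j) and (i, j-1).
    deps = [(1, 0), (0, 1)]
    bounds = (4, 4)
    pi, length = best_linear_schedule(deps, bounds)
    print("schedule vector:", pi, "time steps:", length)
    for t, front in enumerate(wavefronts(pi, bounds)):
        print(t, front)
```

For this hypothetical nest the 16 iterations collapse into 7 wavefronts, each of which could run in parallel. A practical method would replace the exhaustive search with an optimization over the dependence structure, which is the kind of problem the optimal linear schedule techniques named in the abstract address.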
[1] U. Banerjee, "Unimodular transformations of double loops," in Proc. 3rd Workshop Advances in Languages and Compilers for Parallel Computing, A. Nicolau, D. Gelernter, T. Gross, and D. Padua, Eds. Cambridge, MA: MIT Press, 1990, pp. 192-219.
[2] M. S. Bazaraa and C. M. Shetty, Nonlinear Programming: Theory and Algorithms. New York: Wiley, 1979.
[3] M. Byler, J. R. B. Davies, C. Huson, B. Leasure, and M. Wolfe, "Multiple version loops," in Proc. Int. Conf. Parallel Processing, St. Charles, IL, Aug. 1987, pp. 312-318.
[4] R. Cytron, "Doacross: Beyond vectorization for multiprocessors," in Proc. Int. Conf. Parallel Processing, St. Charles, IL, Aug. 1986, pp. 836-844.
[5] J. Fortes and B. W. Wah, "Systolic arrays--From concept to implementation," IEEE Comput. Mag., vol. 20, pp. 12-17, July 1987.
[6] R. Karp, R. Miller, and S. Winograd, "The organization of computations for uniform recurrence equations," J. ACM, vol. 14, no. 3, pp. 563-590, 1967.
[7] L. Lamport, "The parallel execution of DO loops," Commun. ACM, vol. 17, no. 2, pp. 83-93, Feb. 1974.
[8] L. J. Mordell, Diophantine Equations. New York: Academic, 1969, p. 30.
[9] M. T. O'Keefe and H. G. Dietz, "Loop coalescing and scheduling for barrier MIMD architectures," to appear in IEEE Trans. Parallel Distributed Syst.
[10] M. T. O'Keefe and H. G. Dietz, "Hardware barrier synchronization: Static barrier MIMD," in Proc. Int. Conf. Parallel Processing, St. Charles, IL, Aug. 1990, vol. I, pp. 35-42.
[11] J.-K. Peir and R. Cytron, "Minimum distance: A method for partitioning recurrences for multiprocessors," IEEE Trans. Comput., vol. 38, no. 8, pp. 1203-1211, Aug. 1989.
[12] C. D. Polychronopoulos, "Compiler optimizations for enhancing parallelism and their impact on architecture design," IEEE Trans. Comput., vol. 37, no. 8, pp. 991-1004, Aug. 1988.
[13] C. Polychronopoulos, Parallel Programming and Compilers. Kluwer Academic Publishers, 1988.
[14] W. Shang and J. A. B. Fortes, "Time optimal linear schedules for algorithms with uniform dependencies," in Proc. Int. Conf. Systolic Arrays, May 1988, pp. 393-402.
[15] W. Shang and J. A. B. Fortes, "On optimality of linear schedules," J. VLSI Signal Processing, vol. 1, pp. 209-220. Boston: Kluwer.
[16] W. Shang and J. A. B. Fortes, "Time-optimal and conflict-free mappings of uniform dependence algorithms into lower dimensional processor arrays," in Proc. Int. Conf. Parallel Processing, Aug. 1990, pp. 101-110 (I). (Also to appear in IEEE Trans. Parallel and Distributed Syst.)
[17] W. Shang and J. A. B. Fortes, "Time optimal linear schedules for algorithms with uniform dependencies," IEEE Trans. Comput., June 1991, pp. 723-742.
[18] W. Shang and J. A. B. Fortes, "Independent partitioning of uniform dependency algorithms," IEEE Trans. Comput., vol. 41, no. 2, pp. 190-206, Feb. 1992.
[19] M. J. Wolfe, "Loop skewing: The wavefront method revisited," Int. J. Parallel Programming, vol. 15, no. 4, pp. 279-293, Aug. 1986.
[20] M. Wolfe, Optimizing Supercompilers for Supercomputers. Cambridge, MA: MIT Press, 1989.

