
ASCII Text
C.Y. Lin, J.S. Liu, and Y.C. Chung, "Efficient Representation Scheme for Multidimensional Array Operations," IEEE Transactions on Computers, vol. 51, no. 3, pp. 327-345, March 2002.
BibTeX
@article{10.1109/12.990130,
  author = {C.Y. Lin and J.S. Liu and Y.C. Chung},
  title = {Efficient Representation Scheme for Multidimensional Array Operations},
  journal = {IEEE Transactions on Computers},
  volume = {51},
  number = {3},
  issn = {0018-9340},
  year = {2002},
  pages = {327--345},
  doi = {http://doi.ieeecomputersociety.org/10.1109/12.990130},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
RefWorks Procite/RefMan/Endnote
TY  - JOUR
JO  - IEEE Transactions on Computers
TI  - Efficient Representation Scheme for Multidimensional Array Operations
IS  - 3
SN  - 0018-9340
SP  - 327
EP  - 345
EPD - 327-345
A1  - C.Y. Lin
A1  - J.S. Liu
A1  - Y.C. Chung
PY  - 2002
KW  - array operations
KW  - multidimensional arrays
KW  - data structure
KW  - extended Karnaugh map representation
KW  - traditional matrix representation
VL  - 51
JA  - IEEE Transactions on Computers
ER  -
Array operations are used in a large number of important scientific codes, such as molecular dynamics, finite element methods, and climate modeling. Many methods have been proposed in the literature to implement these array operations efficiently. However, most of these methods focus on two-dimensional arrays and usually do not perform well when extended to higher-dimensional arrays. Hence, designing efficient algorithms for multidimensional array operations becomes an important issue. In this paper, we propose a new scheme,