Issue No.04 - April (2008 vol.57)
pp: 567-573
ABSTRACT
Dilated integers form an ordered group of the Cartesian indices into a d-dimensional array represented in the Morton order. Efficient implementations of its operations can be found elsewhere. This paper offers efficient casting (type) conversions to and from an ordinary integer representation. As the Morton order representation for 2D and 3D arrays attracts more users because of its excellent block locality, the efficiency of these conversions becomes important. They are essential for programmers who would use Cartesian indexing there. Two algorithms for each casting conversion are presented here, including to-and-from dilated integers for both d = 2 and d = 3. They fall into two families. One family uses newly compact table lookup, so the cache capacity is better preserved. The other generalizes better to all d, using processor-local arithmetic that is newly presented as abstract d-ary and (d - 1)-ary recurrences. Test results for two and three dimensions generally favor the former.
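For readers unfamiliar with the term, the following is a minimal sketch of what dilating and contracting an integer means for d = 2. It uses the widely known shift-and-mask technique plus a plain 256-entry byte table. It is not a reproduction of the paper's algorithms; its table is not the "newly compact" one the abstract mentions, and its shifts are not the paper's recurrences. The function names, the 16-bit input width, and the choice of which coordinate occupies the odd bit positions are all assumptions made for illustration.

```python
# Illustrative sketch only. Names, the 16-bit width, and the bit-position
# convention in morton_index are assumptions, not taken from the paper.

def dilate_2(x: int) -> int:
    """Spread the low 16 bits of x so that bit i moves to bit 2*i."""
    x &= 0x0000FFFF
    x = (x | (x << 8)) & 0x00FF00FF
    x = (x | (x << 4)) & 0x0F0F0F0F
    x = (x | (x << 2)) & 0x33333333
    x = (x | (x << 1)) & 0x55555555
    return x


def contract_2(x: int) -> int:
    """Inverse of dilate_2: gather the even-position bits into the low 16 bits."""
    x &= 0x55555555
    x = (x | (x >> 1)) & 0x33333333
    x = (x | (x >> 2)) & 0x0F0F0F0F
    x = (x | (x >> 4)) & 0x00FF00FF
    x = (x | (x >> 8)) & 0x0000FFFF
    return x


# Table-lookup variant: precompute the dilation of every byte once,
# then dilate a 16-bit value with two lookups.
DILATE_BYTE = [dilate_2(b) for b in range(256)]


def dilate_2_table(x: int) -> int:
    """Dilate the low 16 bits of x using the 256-entry byte table."""
    return DILATE_BYTE[x & 0xFF] | (DILATE_BYTE[(x >> 8) & 0xFF] << 16)


def morton_index(row: int, col: int) -> int:
    """Interleave two 16-bit Cartesian indices into one Morton-order index.

    Assumed convention: row bits land in odd positions, col bits in even.
    """
    return (dilate_2(row) << 1) | dilate_2(col)
```

For example, `dilate_2(0b11)` returns `0b101`, and `contract_2(dilate_2(n))` returns `n` for any 16-bit `n`. The table variant trades a small amount of cache footprint for fewer arithmetic steps, which is the kind of trade-off the abstract describes between its two families.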
INDEX TERMS
Data Structures: Arrays, Programming Techniques: General, Memory Structures: Design Styles, Analysis of Algorithms, and Problem Complexity: Numerical algorithms, problems: computations on matrices.
CITATION
Rajeev Raman, "Converting to and from Dilated Integers", IEEE Transactions on Computers, vol.57, no. 4, pp. 567-573, April 2008, doi:10.1109/TC.2007.70814