This Article 
 Bibliographic References 
 Add to: 
Analysis of the Clustering Properties of the Hilbert Space-Filling Curve
January/February 2001 (vol. 13 no. 1)
pp. 124-141

Abstract—Several schemes for the linear mapping of a multidimensional space have been proposed for various applications, such as access methods for spatio-temporal databases and image compression. In these applications, one of the most desired properties from such linear mappings is clustering, which means the locality between objects in the multidimensional space being preserved in the linear space. It is widely believed that the Hilbert space-filling curve achieves the best clustering [1], [14]. In this paper, we analyze the clustering property of the Hilbert space-filling curve by deriving closed-form formulas for the number of clusters in a given query region of an arbitrary shape (e.g., polygons and polyhedra). Both the asymptotic solution for the general case and the exact solution for a special case generalize previous work [14]. They agree with the empirical results that the number of clusters depends on the hypersurface area of the query region and not on its hypervolume. We also show that the Hilbert curve achieves better clustering than the z curve. From a practical point of view, the formulas given in this paper provide a simple measure that can be used to predict the required disk access behaviors and, hence, the total access time.

[1] D.J. Abel and D.M. Mark, “A Comparative Analysis of Some Two-Dimensional Orderings,” Int'l J. Geographical Information Systems, vol. 4, no. 1, pp. 21–31, Jan. 1990.
[2] J.J. Bartholdi and L.K. Platzman, “An O($n\log n$) Travelling Salesman Heuristic Based on Spacefilling Curves,” Operation Research Letters, vol. 1, no. 4, pp. 121–125, Sept. 1982.
[3] M. Berger, Geometry II. New York: Springer-Verlag, 1987.
[4] T. Bially, “Space-Filling Curves: Their Generation and Their Application to Bandwidth Reduction,” IEEE Trans. Information Theory, vol. 15, no. 6, pp. 658–664, Nov. 1969.
[5] A.R. Butz, “Convergence with Hilbert's Space Filling Curve,” J. Computer and System Sciences, vol. 3, pp. 128–146, 1969.
[6] I.S. Duff, “Design Features of a Frontal Code for Solving Sparse Unsymmetric Linear Systems Out-of-Core,” SIAM J. Scientific Statistical Computing, vol. 5, no. 2, pp. 270–280, June 1984.
[7] C.R. Dyer, “The Space Efficiency of Quadtrees,” Computer Graphics and Image Processing, vol. 19, no. 4, pp. 335–348, Aug. 1982.
[8] C. Faloutsos, “Multiattribute Hashing Using Gray Codes,” Proc. 1986 ACM SIGMOD Conf., pp. 227–238, May 1986.
[9] C. Faloutsos, "Analytical Results on the Quadtree Decomposition of Arbitrary Rectangles," Pattern Recognition Letters, vol. 13, no. 1, pp. 31-40, Jan. 1992.
[10] C. Faloutsos, H.V. Jagadish, and Y. Manolopoulos, “Analysis of the N-Dimensional Quadtree Decomposition for Arbitrary Hyperrectangles,” IEEE Trans. Knowledge and Data Eng., vol. 9, no. 3, pp. 373–383, May/June 1997.
[11] C. Faloutsos and S. Roseman, "Fractals for Secondary Key Retrival," Proc. Symp. Principles of Database Systems, SIGMOD-SIGACT PODS, 1989.
[12] P.J. Giblin, Graphs, Surfaces, and Homology, second ed. New York: Chapman and Hall, 1981.
[13] D. Hilbert, “Über die stetige Abbildung einer Linie auf Flächenstück,” Math. Ann., vol. 38, pp. 459–460, 1891.
[14] H.V. Jagadish, "Linear Clustering of Objects with Multiple Attributes," Proc. Int'l Conf. Management of Data, pp. 332-342, ACM SIGMOD, 1990.
[15] H.V. Jagadish, "Spatial Search with Polyhedra," Proc. Sixth IEEE Int'l Conf. Data Eng., Feb. 1990.
[16] H.V. Jagadish, “Analysis of the Hilbert Curve for Representing Two-Dimensional Space,” Information Processing Letters, vol. 62, no. 1, pp. 17–22, Apr. 1997.
[17] M. Kaddoura, C.W. Qu, and S. Ranka, “Partitioning Unstructured Computational Graphs for Nonuniform and Adaptive Environments,” IEEE Parallel and Distributed Technology, pp. 63–69, 1995.
[18] A. Lempel and J. Ziv, “Compression of Two-Dimensional Images,” NATO ASI Series, vol. F12, pp. 141–154, June 1984.
[19] J. Orenstein, “Spatial Query Processing in an Object-Oriented Database System,” Proc. Fifth ACM-SIGMOD Conf., pp. 326-336, 1986.
[20] E.A. Patrick, D.R. Anderson, and F.K. Bechtel, “Mapping MultiDimensional Space to One Dimension for Computer Output Display,” IEEE Trans. Computers, vol. 17, no. 10, pp. 949–953, Oct. 1968.
[21] G. Peano, “Sur une Courbe qui Remplit Toute une Aire Plane,” Math. Ann., vol. 36, pp. 157–160, 1890.
[22] R.L. Rivest, “Partial Match Retrieval Algorithms,” SIAM J. Computing, vol. 5, no. 1, pp. 19–50, Mar. 1976.
[23] Y. Rong and C. Faloutsos, "Analysis of the Clustering Property of Peano Curves," Technical Report CS-TR-2792, UMIACS-TR-91-151, Univ. of Maryland, Dec. 1991.
[24] J.B. Rothnie and T. Lozano, “Attribute Based File Organization in a Paged Memory Environment,” Comm. ACM, vol. 17, no. 2, pp. 63–69, Feb. 1974.
[25] C. Ruemmler and J. Wilkes, "An Introduction to Disk Drive Modeling," Computer, vol. 27, no. 3, pp. 17-28, Mar. 1994.
[26] H. Sagan, “A Three-Dimensional Hilbert Curve,” Int'l J. Math. Ed. Science Technology, vol. 24, pp. 541–545, 1993.
[27] C.A. Shaffer, "A Formula for Computing the Number of Quadtree Node Fragments Created by a Shift," Pattern Recognition Letters, vol. 7, no. 1, pp. 45-49, Jan. 1988.
[28] G.F. Simmons, Introduction to Topology and Modern Analysis. New York: McGraw-Hill Book Company, Inc., 1963.

Index Terms:
Locality-preserving linear mapping, range queries, multiattribute access methods, data clustering, Hilbert curve, space-filling curves, fractals.
Bongki Moon, H.v. Jagadish, Christos Faloutsos, Joel H. Saltz, "Analysis of the Clustering Properties of the Hilbert Space-Filling Curve," IEEE Transactions on Knowledge and Data Engineering, vol. 13, no. 1, pp. 124-141, Jan.-Feb. 2001, doi:10.1109/69.908985
Usage of this product signifies your acceptance of the Terms of Use.