Issue No.01 - January (2011 vol.23)
pp: 139-154
Ke Yi, Hong Kong University of Science and Technology, Hong Kong
Xiang Lian, Hong Kong University of Science and Technology, Hong Kong
Feifei Li, Florida State University, Tallahassee, FL
Lei Chen, Hong Kong University of Science and Technology, Hong Kong
ABSTRACT
With the advance of wireless communication technology, it is now common for people to view maps or use related services on handheld devices such as mobile phones and PDAs. Range queries, among the most commonly used tools, are often posed by users to retrieve needed information from a spatial database. However, because of the limited communication bandwidth and hardware power of handheld devices, displaying all the results of a range query on such a device is neither communication-efficient nor informative to the user, simply because a range query often returns too many results. In view of this problem, we propose that a concise representation of a specified size for the range query results, incurring minimal information loss, be computed and returned to the user. Such a concise range query not only reduces communication costs but also offers better usability, providing an opportunity for interactive exploration. The usefulness of concise range queries is confirmed by comparing them with other possible alternatives, such as sampling and clustering. Unfortunately, we prove that finding the optimal representation with minimum information loss is an NP-hard problem. Therefore, we propose several effective and nontrivial algorithms to find a good approximate result. Extensive experiments on real-world data have demonstrated the effectiveness and efficiency of the proposed techniques.
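To make the idea of a fixed-size summary concrete, the following is a minimal Python sketch, not the algorithms proposed in the paper. It rests on several assumptions that the abstract does not state: each entry of the concise answer is taken to be a bounding box plus a point count, the information loss is measured as the count-weighted average of each box's width plus height, and the grouping is a naive sort-and-cut heuristic chosen only for brevity. The names `Group`, `range_query`, `summarize`, and `information_loss` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Group:
    """One entry of the concise answer (assumed form): a box and a count."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    count: int


def range_query(points: List[Point], x_lo: float, y_lo: float,
                x_hi: float, y_hi: float) -> List[Point]:
    """Plain range query: every point inside the closed query rectangle."""
    return [(x, y) for x, y in points
            if x_lo <= x <= x_hi and y_lo <= y <= y_hi]


def summarize(points: List[Point], k: int) -> List[Group]:
    """Naive heuristic (an assumption, not the paper's method):
    sort points by x, then y, and cut them into at most k contiguous chunks."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not points:
        return []
    ordered = sorted(points)
    n = len(ordered)
    k = min(k, n)  # never create empty groups
    groups = []
    for i in range(k):
        chunk = ordered[i * n // k:(i + 1) * n // k]
        xs = [p[0] for p in chunk]
        ys = [p[1] for p in chunk]
        groups.append(Group(min(xs), min(ys), max(xs), max(ys), len(chunk)))
    return groups


def information_loss(groups: List[Group]) -> float:
    """Assumed loss measure: count-weighted average of (width + height)."""
    total = sum(g.count for g in groups)
    if total == 0:
        return 0.0
    weighted = sum(g.count * ((g.x_max - g.x_min) + (g.y_max - g.y_min))
                   for g in groups)
    return weighted / total


if __name__ == "__main__":
    data = [(0, 0), (1, 1), (10, 10), (11, 11), (50, 50)]
    hits = range_query(data, 0, 0, 20, 20)
    for k in (1, 2):
        summary = summarize(hits, k)
        print(k, summary, information_loss(summary))
```

Under these assumptions, a larger size budget `k` yields tighter boxes and lower loss, while a smaller `k` sends less data to the device. Finding the grouping that minimizes the loss for a given `k` is the hard part, which is why a heuristic stands in here.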
INDEX TERMS
Spatial databases, range queries, algorithms.
CITATION
Ke Yi, Xiang Lian, Feifei Li, Lei Chen, "The World in a Nutshell: Concise Range Queries", IEEE Transactions on Knowledge & Data Engineering, vol. 23, no. 1, pp. 139-154, January 2011, doi:10.1109/TKDE.2010.35