Issue No.03 - March (2009 vol.21)

pp: 351-365

Lei Chen, Hong Kong University of Science and Technology, Hong Kong

Xiang Lian, Hong Kong University of Science and Technology, Hong Kong

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2008.146

ABSTRACT

Skyline queries are of great importance in many applications, such as multi-criteria decision making and business planning. In particular, a skyline point is a data object in the database whose attribute vector is not dominated by that of any other object. Previous methods to retrieve skyline points usually assume static data objects in the database (i.e., their attribute vectors are fixed), whereas several recent works focus on skyline queries with dynamic attributes. In this paper, we propose a novel variant of skyline queries, namely the metric skyline, whose dynamic attributes are defined in a metric space (i.e., not limited to the Euclidean space). We present an efficient and effective pruning mechanism to answer metric skyline queries through a metric index. Most importantly, we formalize the query performance of the metric skyline query in terms of pruning power through a cost model, in light of which we construct an optimized metric index that aims to maximize the pruning power of metric skyline queries. Extensive experiments demonstrate the efficiency and effectiveness of our proposed pruning techniques, as well as of the constructed index, in answering metric skyline queries.
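To make the dominance definition concrete, the following is a minimal brute-force sketch and not the indexed, pruning-based method of the paper. It assumes that an object's dynamic attributes are its distances to a set of query objects, which the abstract does not spell out. The names `dominates`, `metric_skyline`, and `edit_distance` are hypothetical and chosen only for illustration. Edit distance on strings stands in for an arbitrary non-Euclidean metric.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if vector a dominates vector b (smaller is better):
    a is no larger than b in every dimension and strictly smaller in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def metric_skyline(
    objects: Sequence[T],
    queries: Sequence[T],
    dist: Callable[[T, T], float],
) -> List[T]:
    """Brute-force metric skyline (illustrative assumption, O(n^2) comparisons).

    Each object's dynamic attribute vector is assumed to be its distances
    to the query objects; an object is kept if no other vector dominates it.
    """
    vectors = [[dist(o, q) for q in queries] for o in objects]
    skyline = []
    for i, obj in enumerate(objects):
        dominated = any(
            dominates(vectors[j], vectors[i])
            for j in range(len(objects))
            if j != i
        )
        if not dominated:
            skyline.append(obj)
    return skyline


def edit_distance(s: str, t: str) -> int:
    """Levenshtein distance, a metric that is not Euclidean."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            curr.append(
                min(
                    prev[j] + 1,               # deletion
                    curr[j - 1] + 1,           # insertion
                    prev[j - 1] + (cs != ct),  # substitution or match
                )
            )
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    words = ["cat", "cart", "dog", "card"]
    print(metric_skyline(words, ["cat", "card"], edit_distance))
```

With the query objects `"cat"` and `"card"`, the word `"dog"` is farther from both queries than `"cart"` is, so it is dominated and excluded. A metric index would aim to avoid most of these pairwise distance computations, which is the cost this sketch does not attempt to reduce.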

INDEX TERMS

Query processing, Multimedia databases, Indexing methods

CITATION

Lei Chen, Xiang Lian, "Efficient Processing of Metric Skyline Queries", *IEEE Transactions on Knowledge & Data Engineering*, vol. 21, no. 3, pp. 351-365, March 2009, doi:10.1109/TKDE.2008.146