Issue No.03 - March (2012 vol.24)
pp: 492-505
Man Lung Yiu , Hong Kong Polytechnic University, Hong Kong
Eric Lo , Hong Kong Polytechnic University, Hong Kong
Duncan Yung , Hong Kong Polytechnic University, Hong Kong
ABSTRACT
The data cube is a key element in supporting fast OLAP. Traditionally, an aggregate function is used to compute the values in data cubes. In this paper, we extend the notion of data cubes with a new perspective. Instead of using an aggregate function, we propose to build data cubes using the skyline operation as the “aggregate function.” Data cubes built in this way are called “group-by skyline cubes” and can support a variety of analytical tasks. Nevertheless, there are several challenges in implementing group-by skyline cubes in data warehouses: 1) the skyline operation is computationally intensive, 2) the skyline operation is holistic, and 3) a group-by skyline cube contains both grouping and skyline dimensions, rendering it infeasible to precompute all cuboids in advance. This paper gives details on how to store, materialize, and query such cubes.
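To make the idea of using the skyline operation as the "aggregate function" concrete, the following is a minimal sketch of a single group-by skyline cell computation. It is not the paper's algorithm: the quadratic dominance check, the function names (dominates, skyline, group_by_skyline), the convention that smaller measure values are better, and the hotel-style sample rows are all assumptions made for illustration.

```python
from collections import defaultdict

def dominates(a, b):
    """Return True if a dominates b: a is no worse on every measure and
    strictly better on at least one (assumption: smaller is better)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def skyline(points):
    """Naive O(n^2) skyline: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def group_by_skyline(rows, group_key, measures):
    """Partition rows by the grouping dimension(s), then apply the skyline
    to the measure vectors of each group, in place of SUM or AVG."""
    groups = defaultdict(list)
    for row in rows:
        groups[group_key(row)].append(measures(row))
    return {g: skyline(pts) for g, pts in groups.items()}

# Hypothetical rows: (city, price, distance). City is the grouping
# dimension; price and distance are the skyline dimensions.
rows = [
    ("Paris", 100, 2.0),
    ("Paris", 120, 1.0),
    ("Paris", 130, 2.5),  # dominated by (100, 2.0)
    ("Rome", 80, 3.0),
    ("Rome", 90, 3.5),    # dominated by (80, 3.0)
]

result = group_by_skyline(
    rows,
    group_key=lambda r: r[0],
    measures=lambda r: (r[1], r[2]),
)
```

Each cell of the result holds a set of non-dominated measure vectors rather than a single number, which gives some intuition for why the abstract describes the operation as holistic and costly to precompute for every combination of grouping and skyline dimensions.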
INDEX TERMS
Query processing; data warehouse and repository.
CITATION
Man Lung Yiu, Eric Lo, Duncan Yung, "Measuring the Sky: On Computing Data Cubes via Skylining the Measures", IEEE Transactions on Knowledge & Data Engineering, vol.24, no. 3, pp. 492-505, March 2012, doi:10.1109/TKDE.2010.253