Data cube computation is one of the most essential but expensive operations in data warehousing. Previous studies have developed two major approaches: top-down and bottom-up. The former, represented by the MultiWay Array Cube (MultiWay) algorithm [30], aggregates simultaneously on multiple dimensions; however, it cannot take advantage of a priori pruning [2] when computing iceberg cubes (cubes that contain only aggregate cells whose measure values satisfy a threshold, called the iceberg condition). The latter, represented by BUC [6], computes the iceberg cube bottom-up and facilitates a priori pruning. BUC exploits fast sorting and partitioning techniques; however, it does not fully exploit multidimensional simultaneous aggregation. In this paper, we present a new method, Star-Cubing, that integrates the strengths of the previous two algorithms and performs aggregations on multiple dimensions simultaneously. It utilizes a star-tree structure, extends the simultaneous aggregation methods, and enables the pruning of group-bys that do not satisfy the iceberg condition. Our performance study shows that Star-Cubing is highly efficient and outperforms the previous methods.
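To make the iceberg condition and a priori pruning concrete, the following is a minimal sketch of a bottom-up, BUC-style computation with a count threshold. It is not the Star-Cubing algorithm and does not use a star-tree; the function name `iceberg_cube`, the `min_count` parameter, and the `"*"` marker for an aggregated dimension are assumptions chosen for illustration only.

```python
from collections import defaultdict

ALL = "*"  # hypothetical marker for a dimension that has been aggregated away


def iceberg_cube(rows, min_count):
    """Return {cell: count} for every cell whose count is at least min_count.

    `rows` is a list of equal-length tuples, one value per dimension.
    The iceberg condition assumed here is count >= min_count.
    """
    num_dims = len(rows[0]) if rows else 0
    result = {}

    def expand(partition, cell, start_dim):
        # `partition` holds exactly the rows that match `cell`.
        if len(partition) < min_count:
            # A priori pruning: any more specific cell is drawn from a
            # subset of this partition, so it cannot reach the threshold.
            return
        result[tuple(cell)] = len(partition)
        for d in range(start_dim, num_dims):
            # Partition the current rows on dimension d.
            groups = defaultdict(list)
            for row in partition:
                groups[row[d]].append(row)
            for value, group in groups.items():
                cell[d] = value
                expand(group, cell, d + 1)
            cell[d] = ALL  # restore before moving to the next dimension

    expand(list(rows), [ALL] * num_dims, 0)
    return result
```

For example, calling `iceberg_cube` on the four rows `(a1, b1, c1)`, `(a1, b1, c2)`, `(a1, b2, c1)`, `(a2, b1, c1)` with `min_count=2` never expands the partition for `a2`, because it holds a single row. This is the pruning that a purely top-down array method cannot exploit, while the sketch itself aggregates only one dimension at a time, which is the limitation the abstract attributes to the bottom-up approach.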
Data warehouse, data mining, online analytical processing (OLAP).
Benjamin W. Wah, Xiaolei Li, Zheng Shao, Jiawei Han, Dong Xin, "Computing Iceberg Cubes by Top-Down and Bottom-Up Integration: The StarCubing Approach", IEEE Transactions on Knowledge & Data Engineering, vol. 19, no. , pp. 111-126, January 2007, doi:10.1109/TKDE.2007.4