Issue No. 01 - January (2007 vol. 19)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2007.4
Dong Xin, IEEE
Jiawei Han, IEEE
Benjamin W. Wah, IEEE
Data cube computation is one of the most essential but expensive operations in data warehousing. Previous studies have developed two major approaches: top-down and bottom-up. The former, represented by the MultiWay Array Cube (called the MultiWay) algorithm, aggregates simultaneously on multiple dimensions; however, it cannot take advantage of a priori pruning when computing iceberg cubes (cubes that contain only aggregate cells whose measure values satisfy a threshold, called the iceberg condition). The latter, represented by BUC, computes the iceberg cube bottom-up and facilitates a priori pruning. BUC explores fast sorting and partitioning techniques; however, it does not fully explore multidimensional simultaneous aggregation. In this paper, we present a new method, Star-Cubing, that integrates the strengths of the previous two algorithms and performs aggregations on multiple dimensions simultaneously. It utilizes a star-tree structure, extends the simultaneous aggregation methods, and enables the pruning of the group-bys that do not satisfy the iceberg condition. Our performance study shows that Star-Cubing is highly efficient and outperforms the previous methods.
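To make the iceberg condition and a priori pruning concrete, the following is a minimal sketch in Python. It is not the Star-Cubing algorithm and not the authors' code: it is a simplified BUC-style recursion, it assumes the measure is a tuple count, and the names `iceberg_cube`, `expand`, and `min_count` are hypothetical. Because a count can only shrink as a cell becomes more specific, any partition that falls below the threshold can be discarded together with all of its descendants.

```python
from collections import defaultdict


def iceberg_cube(rows, num_dims, min_count):
    """Return {cell: count} for every cell whose count is >= min_count.

    A cell is a tuple of length num_dims; '*' marks an aggregated dimension.
    Assumption for illustration: the measure is COUNT and the iceberg
    condition is count >= min_count.
    """
    result = {}

    def expand(partition, cell, start_dim):
        # A priori pruning: count is anti-monotone, so a partition below the
        # threshold cannot contain any qualifying, more specific cell.
        if len(partition) < min_count:
            return
        result[tuple(cell)] = len(partition)
        # Refine the current cell on each remaining dimension in turn.
        for d in range(start_dim, num_dims):
            groups = defaultdict(list)
            for row in partition:
                groups[row[d]].append(row)
            for value, group in groups.items():
                cell[d] = value
                expand(group, cell, d + 1)
            cell[d] = '*'  # restore before moving to the next dimension

    expand(list(rows), ['*'] * num_dims, 0)
    return result
```

For example, calling `iceberg_cube` on the four tuples `("a1", "b1")`, `("a1", "b1")`, `("a1", "b2")`, `("a2", "b1")` with `num_dims=2` and `min_count=2` never examines any cell below `("a2", "*")`, because that partition holds a single tuple. The sketch aggregates one dimension at a time, which is the limitation of the bottom-up approach that the abstract describes.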
Data warehouse, data mining, online analytical processing (OLAP).
B. W. Wah, X. Li, Z. Shao, J. Han and D. Xin, "Computing Iceberg Cubes by Top-Down and Bottom-Up Integration: The StarCubing Approach," in IEEE Transactions on Knowledge & Data Engineering, vol. 19, no. 1, pp. 111-126, 2007.