2015 International Conference on Big Data and Smart Computing (BigComp) (2015)
Jeju, South Korea
Feb. 9, 2015 to Feb. 11, 2015
ISBN: 978-1-4799-7303-3
pp: 95-102
Suan Lee, Department of Computer Science, Kangwon National University, 192-1 Hyoja-dong, Chuncheon, Kangwon, Korea
Sunhwa Jo, Department of Computer Science, Kangwon National University, 192-1 Hyoja-dong, Chuncheon, Kangwon, Korea
Jinho Kim, Department of Computer Science, Kangwon National University, 192-1 Hyoja-dong, Chuncheon, Kangwon, Korea
ABSTRACT
The data cube is used as an OLAP (On-Line Analytical Processing) model for multidimensional analysis in many application fields. Computing a data cube requires a long sequence of basic operations and incurs high storage costs. Data volumes are growing exponentially and have reached a magnitude that overwhelms the processing capacity of a single computer. In this paper, we implement large-scale data cube computation based on distributed parallel computing using the MapReduce (MR) computational framework. For this purpose, we developed a new algorithm, MRDataCube, which incorporates the MR mechanism into data cube computation so that cubes can be computed more efficiently on the same computing resources. MRDataCube consists of two MR phases, namely MRSpread and MRAssemble. The main feature of the algorithm is continuous data reduction: the partial cuboids and partial cells emitted as the computation passes through these two phases are combined along the way. Our experimental results show that MRDataCube outperforms the other algorithms it was compared against.
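To make the idea of combining partial cells concrete, the following is a minimal single-process sketch of a two-phase cube computation. It is a generic illustration only, not the MRSpread and MRAssemble phases described in the paper: the function names, the use of SUM as the aggregate, the "*" marker, and the sample rows are all assumptions introduced here.

```python
from collections import Counter
from itertools import combinations

ALL = "*"  # hypothetical marker for a dimension that has been aggregated away


def cells_for_row(dims, measure):
    """Map-like step: emit one (cell_key, measure) pair per cuboid (2^n per row)."""
    n = len(dims)
    for k in range(n + 1):
        for kept in combinations(range(n), k):
            key = tuple(dims[i] if i in kept else ALL for i in range(n))
            yield key, measure


def partial_cube(partition):
    """Phase 1 (per partition): aggregate locally into partial cells."""
    acc = Counter()
    for *dims, measure in partition:
        for key, value in cells_for_row(tuple(dims), measure):
            acc[key] += value  # SUM is assumed as the aggregate function
    return acc


def merge_partials(partials):
    """Phase 2: combine partial cells from all partitions into final cells."""
    cube = Counter()
    for partial in partials:
        cube.update(partial)  # Counter.update adds values for matching keys
    return cube


def compute_cube(partitions):
    return merge_partials(partial_cube(p) for p in partitions)


# Made-up sample data: (city, year, sales), split across two partitions.
partitions = [
    [("seoul", "2014", 10), ("seoul", "2015", 5)],
    [("jeju", "2015", 7), ("seoul", "2015", 3)],
]
cube = compute_cube(partitions)
```

Here `cube[("seoul", "*")]` holds the total for one city across all years, and `cube[("*", "*")]` holds the grand total. Aggregating inside each partition before merging is what keeps the data passed between phases small; in a real MapReduce job the two functions would run as separate jobs or as a combiner and a reducer.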
INDEX TERMS
Aggregates, Distributed databases, Manganese, Parallel processing, Data models, Arrays
CITATION

S. Lee, S. Jo and J. Kim, "MRDataCube: Data cube computation using MapReduce," 2015 International Conference on Big Data and Smart Computing (BigComp), Jeju, South Korea, 2015, pp. 95-102.
doi:10.1109/35021BIGCOMP.2015.7072817