2010 IEEE 26th International Conference on Data Engineering Workshops (ICDEW 2010)
Long Beach, CA, USA
Mar. 1, 2010 to Mar. 6, 2010
ISBN: 978-1-4244-6522-4
pp: 41-51
Shengsheng Huang, Intel China Software Center, Shanghai, P.R. China, 200241
Jie Huang, Intel China Software Center, Shanghai, P.R. China, 200241
Jinquan Dai, Intel China Software Center, Shanghai, P.R. China, 200241
Tao Xie, Intel China Software Center, Shanghai, P.R. China, 200241
Bo Huang, Intel China Software Center, Shanghai, P.R. China, 200241
ABSTRACT
The MapReduce model is becoming prominent for large-scale data analysis in the cloud. In this paper, we present the benchmarking, evaluation, and characterization of Hadoop, an open-source implementation of MapReduce. We first introduce HiBench, a new benchmark suite for Hadoop that consists of a set of Hadoop programs, including both synthetic micro-benchmarks and real-world Hadoop applications. We then use HiBench to evaluate and characterize the Hadoop framework in terms of speed (i.e., job running time), throughput (i.e., the number of tasks completed per minute), HDFS bandwidth, system resource (e.g., CPU, memory, and I/O) utilization, and data access patterns.
CITATION

B. Huang, S. Huang, J. Dai, J. Huang and T. Xie, "The HiBench benchmark suite: Characterization of the MapReduce-based data analysis," 2010 IEEE 26th International Conference on Data Engineering Workshops (ICDEW 2010), Long Beach, CA, USA, 2010, pp. 41-51.
doi:10.1109/ICDEW.2010.5452747