Improving MapReduce Performance Using Smart Speculative Execution Strategy
PrePrint
ISSN: 0018-9340
Qi Chen, Peking University, Beijing
Cheng Liu, Peking University, Beijing
Zhen Xiao, Peking University, Beijing
MapReduce is a widely used parallel computing framework for large-scale data processing. Its performance is seriously degraded by stragglers, machines on which tasks take an unusually long time to finish. Speculative execution is a common approach to this problem: slow tasks are backed up on alternative machines. Existing strategies have some pitfalls: i) they identify slow tasks by average progress rate, although the progress rate can be unstable, and ii) they pay little attention to data locality when choosing backup nodes. In this paper, we first provide a detailed analysis of the pitfalls in existing strategies. We then develop a new strategy named MCP (Maximum Cost Performance), which significantly improves the effectiveness of speculative execution. MCP uses the following methods: i) it uses an exponentially weighted moving average (EWMA) to predict processing speed and calculate a task’s remaining time, ii) it determines which tasks to back up based on the cluster load, using a cost-benefit model, and iii) it takes both data locality and data skew into consideration when choosing nodes for backups. We evaluate MCP in a cluster of 101 virtual machines with several applications. Experimental results show that MCP can run jobs up to 39% faster and improve cluster throughput by up to 44% compared to Hadoop-0.21.
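To make the first of MCP's methods concrete, the following is a minimal sketch of how an EWMA could smooth unstable progress-rate samples and feed a remaining-time estimate. It is not taken from the paper: the function names, the smoothing factor `alpha`, and the representation of progress as a fraction between 0 and 1 are all assumptions made for illustration, and the abstract does not state the exact formula MCP uses.

```python
def ewma_rates(samples, alpha=0.2):
    """Smooth observed progress rates with an exponentially weighted moving average.

    `samples` is a sequence of progress rates (fraction of the task per second).
    `alpha` is a hypothetical smoothing factor; the abstract does not give a value.
    """
    estimate = None
    smoothed = []
    for rate in samples:
        if estimate is None:
            # Seed the average with the first observation.
            estimate = rate
        else:
            # Newer samples get weight alpha, the running estimate gets 1 - alpha.
            estimate = alpha * rate + (1 - alpha) * estimate
        smoothed.append(estimate)
    return smoothed


def remaining_time(progress, predicted_rate):
    """Estimate seconds left: remaining work divided by the predicted rate.

    `progress` is assumed to be a fraction in [0, 1].
    """
    if predicted_rate <= 0:
        # A stalled task has no finite estimate.
        return float("inf")
    return (1.0 - progress) / predicted_rate


if __name__ == "__main__":
    rates = [0.010, 0.012, 0.002, 0.011]  # made-up samples with one dip
    predicted = ewma_rates(rates)[-1]
    print(remaining_time(0.4, predicted))
```

Compared with a plain average over the task's lifetime, the smoothed estimate reacts to recent changes in speed while damping a single outlier sample, which is the instability the abstract points to.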
Index Terms:
Optimization, Silicon, Time factors, Algorithm design and analysis, Redundancy, Real-time systems, Indexes, speculative execution, parallel computing, MapReduce
Citation:
Qi Chen, Cheng Liu, Zhen Xiao, "Improving MapReduce Performance Using Smart Speculative Execution Strategy," IEEE Transactions on Computers, 28 Jan. 2013. IEEE Computer Society Digital Library. IEEE Computer Society, <http://doi.ieeecomputersociety.org/10.1109/TC.2013.15>