DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TC.2013.15
MapReduce is a widely used parallel computing framework for large-scale data processing. Its performance is seriously degraded by stragglers, machines on which tasks take an unusually long time to finish. Speculative execution is a common remedy for this problem: slow tasks are backed up on alternative machines. Existing strategies have some pitfalls: i) they identify slow tasks by average progress rate, although the progress rate can be unstable; ii) they pay little attention to data locality when choosing backup nodes. In this paper, we first provide a detailed analysis of the pitfalls in existing strategies. We then develop a new strategy named MCP (Maximum Cost Performance), which significantly improves the effectiveness of speculative execution. MCP uses the following methods: i) it uses an exponentially weighted moving average (EWMA) to predict processing speed and calculate a task's remaining time; ii) it determines which tasks to back up based on the cluster load, using a cost-benefit model; iii) it takes both data locality and data skew into consideration when choosing nodes for backups. We evaluate MCP on a cluster of 101 virtual machines with several applications. Experimental results show that MCP can run jobs up to 39% faster and improve cluster throughput by up to 44% compared to Hadoop-0.21.
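The abstract does not state MCP's exact formulas, so the following is only a minimal sketch of the first method, assuming the textbook EWMA recurrence Z(t) = α·Y(t) + (1 − α)·Z(t − 1) and a remaining time equal to the unfinished fraction of work divided by the predicted speed. The function names, the smoothing factor `alpha=0.2`, and the representation of progress as a fraction in [0, 1] are illustrative assumptions, not taken from the paper.

```python
def ewma(samples, alpha=0.2):
    """Exponentially weighted moving average of a sequence of samples.

    alpha is the smoothing factor in (0, 1]; the default of 0.2 is an
    arbitrary illustrative value, not one taken from the paper.
    """
    if not samples:
        raise ValueError("need at least one sample")
    estimate = samples[0]  # seed the average with the oldest sample
    for value in samples[1:]:
        # Newer observations get weight alpha, the history gets 1 - alpha.
        estimate = alpha * value + (1 - alpha) * estimate
    return estimate


def estimate_remaining_time(progress, speed_samples, alpha=0.2):
    """Estimate the seconds a task still needs (hypothetical helper).

    progress: fraction of the task already completed, in [0, 1].
    speed_samples: observed progress per second, oldest first.
    """
    speed = ewma(speed_samples, alpha)
    if speed <= 0:
        return float("inf")  # a stalled task never finishes at this rate
    return (1.0 - progress) / speed
```

For example, `estimate_remaining_time(0.5, [0.02, 0.02, 0.005, 0.005])` smooths a task whose speed has recently dropped. A larger `alpha` makes the prediction follow the recent slowdown more closely, while a smaller one damps short-lived fluctuations.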
Index Terms:
Optimization, Silicon, Time factors, Algorithm design and analysis, Redundancy, Real-time systems, Indexes, speculative execution, parallel computing, MapReduce
Citation:
Qi Chen, Cheng Liu, Zhen Xiao, "Improving MapReduce Performance Using Smart Speculative Execution Strategy," IEEE Transactions on Computers, 28 Jan. 2013. IEEE Computer Society Digital Library. IEEE Computer Society, <http://doi.ieeecomputersociety.org/10.1109/TC.2013.15>

