IEEE Transactions on Parallel and Distributed Systems

IEEE Transactions on Parallel and Distributed Systems (TPDS) is a scholarly archival journal published monthly. Parallel and distributed computing are foundational research areas and technologies for rapidly advancing computer systems and their applications.

IEEE Transactions on Parallel and Distributed Systems (TPDS) has moved to the OnlinePlus publication model.

From the September 2015 Issue

LIBRA: Lightweight Data Skew Mitigation in MapReduce

By Qi Chen, Jinyu Yao, and Zhen Xiao

Free Featured Article

MapReduce is an effective tool for parallel data processing. One significant issue in practical MapReduce applications is data skew: the imbalance in the amount of data assigned to each task. This causes some tasks to take much longer to finish than others and can significantly impact performance. This paper presents LIBRA, a lightweight strategy to address the data skew problem among the reducers of MapReduce applications. Unlike previous work, LIBRA does not require any pre-run sampling of the input data or prevent the overlap between the map and the reduce stages. It uses an innovative sampling method which can achieve a highly accurate approximation to the distribution of the intermediate data by sampling only a small fraction of the intermediate data during the normal map processing. It allows the reduce tasks to start copying as soon as the chosen sample map tasks (only a small fraction of map tasks which are issued first) complete. It supports the split of large keys when application semantics permit and the total order of the output data. It considers the heterogeneity of the computing resources when balancing the load among the reduce tasks appropriately. LIBRA is applicable to a wide range of applications and is transparent to the users. We implement LIBRA in Hadoop and our experiments show that LIBRA has negligible overhead and can speed up the execution of some popular applications by up to a factor of 4.
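The abstract does not spell out LIBRA's partitioning algorithm, so the following is only a minimal sketch of the general idea it describes: use sampled key counts to cut the sorted key space into contiguous ranges, size each range in proportion to a reducer's capacity, and optionally split a large key across neighboring reducers. The function names, the capacity weights, and the greedy strategy are all assumptions made for illustration, not the authors' implementation.

```python
def plan_partitions(sample_counts, capacities, allow_split=False):
    """Greedy range partitioning over sorted keys (illustrative only).

    sample_counts: mapping key -> number of sampled records with that key
    capacities:    positive relative speeds of the reducers (hypothetical weights)
    Returns one list per reducer of (key, share) pairs, where share is the
    fraction of that key's records assigned to the reducer.
    """
    total = sum(sample_counts.values())
    cap_sum = sum(capacities)
    # Each reducer's target load is proportional to its capacity.
    targets = [total * c / cap_sum for c in capacities]
    plan = [[] for _ in capacities]
    last = len(capacities) - 1
    r = 0        # reducer currently being filled
    load = 0.0   # estimated records already assigned to reducer r

    # Walking keys in sorted order keeps each reducer's range contiguous,
    # so concatenating reducer outputs preserves a total order.
    for key in sorted(sample_counts):
        count = float(sample_counts[key])
        remaining = count
        while remaining > 0:
            room = targets[r] - load
            if r == last or remaining <= room or not allow_split:
                # The whole remainder of this key goes to reducer r.
                plan[r].append((key, remaining / count))
                load += remaining
                remaining = 0.0
            else:
                # Fill reducer r exactly, then spill the rest to the next one.
                if room > 0:
                    plan[r].append((key, room / count))
                    remaining -= room
                r += 1
                load = 0.0
        if r < last and load >= targets[r]:
            r += 1
            load = 0.0
    return plan


def estimated_loads(plan, sample_counts):
    """Estimated number of sampled records each reducer would receive."""
    return [sum(sample_counts[k] * share for k, share in part) for part in plan]
```

With sampled counts `{"a": 10, "b": 80, "c": 10}` and two equal reducers, the plan without key splitting yields estimated loads of 90 and 10, because the heavy key `"b"` must stay on one reducer. With `allow_split=True` the loads become 50 and 50, and with capacities `[3, 1]` they become 75 and 25, which illustrates why splitting large keys and accounting for heterogeneous resources both matter for balance.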


Editorials and Announcements


  • According to Thomson Reuters' 2013 Journal Citation Report, TPDS has an impact factor of 2.173.

  • TPDS celebrates its 25th Anniversary. Editor-in-Chief David A. Bader says, "Congratulations to TPDS on its Silver Jubilee! For 25 years, TPDS has been the parallel and distributed computing community's flagship journal for research breakthroughs!"

  • Get Your Journals as eBooks for Free


Guest Editorials

Reviewers List

Annual Index

Access recently published TPDS articles

Subscribe to the RSS feed of the latest TPDS content added to the digital library.

Sign up for the Transactions Connection newsletter.

Listen to the OnlinePlus Podcast: Computer Society Publishing—two more titles migrate to OnlinePlus™ in 2012.

In this podcast, VP of Publications David Alan Grier talks about Transactions on Mobile Computing and Transactions on Parallel and Distributed Systems migrating to OnlinePlus™.

TPDS is indexed in ISI