Who should review this change?: Putting text and file location analyses together for more accurate recommendations
2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015)
Sept. 29, 2015 to Oct. 1, 2015
Xin Xia, College of Computer Science and Technology, Zhejiang University, Hangzhou, China
David Lo, School of Information Systems, Singapore Management University, Singapore
Xinyu Wang, College of Computer Science and Technology, Zhejiang University, Hangzhou, China
Xiaohu Yang, College of Computer Science and Technology, Zhejiang University, Hangzhou, China
Software code review is a process in which developers inspect new code changes made by others, to evaluate their quality and to identify and fix defects, before the changes are integrated into the main branch of a version control system. Modern Code Review (MCR), a lightweight, tool-based variant of conventional code review, is widely adopted in both open source and proprietary software projects. One challenge that affects MCR is assigning appropriate developers to review a code change. Because a software project may have hundreds of potential code reviewers, picking suitable reviewers is not a straightforward task. A prior study by Thongtanunam et al. showed that difficulty in selecting suitable reviewers may delay the review process by an average of 12 days. In this paper, to address the challenge of assigning suitable reviewers to changes, we propose Tie, a hybrid and incremental approach that combines the advantages of Text mIning and a filE location-based approach. To do this, Tie integrates an incremental text mining model, which analyzes the textual content of a review request, with a similarity model, which measures the similarity between changed file paths and reviewed file paths. We perform a large-scale experiment on four open source projects, namely Android, OpenStack, QT, and LibreOffice, containing a total of 42,045 reviews. The experimental results show that, on average across the four projects, Tie achieves top-1, top-5, and top-10 accuracies and a Mean Reciprocal Rank (MRR) of 0.52, 0.79, 0.85, and 0.64, respectively. These results improve on the state-of-the-art approach RevFinder, proposed by Thongtanunam et al., by 61%, 23%, 8%, and 37%, respectively.
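The abstract reports top-k accuracy and Mean Reciprocal Rank (MRR) without defining them. The sketch below gives one common way to compute both for ranked reviewer recommendations. It is an illustration only: it assumes a recommendation counts as correct when at least one actual reviewer appears in the top k, which may differ in detail from the paper's definition. The function names and the reviewer data are hypothetical.

```python
from typing import List, Set


def top_k_accuracy(rankings: List[List[str]], actual: List[Set[str]], k: int) -> float:
    """Fraction of reviews with at least one actual reviewer in the top k."""
    hits = 0
    for ranked, truth in zip(rankings, actual):
        # Assumption: a single overlap with the actual reviewers is a hit.
        if set(ranked[:k]) & truth:
            hits += 1
    return hits / len(rankings)


def mean_reciprocal_rank(rankings: List[List[str]], actual: List[Set[str]]) -> float:
    """Average of 1/rank of the first correct reviewer (0 when none is listed)."""
    total = 0.0
    for ranked, truth in zip(rankings, actual):
        for position, reviewer in enumerate(ranked, start=1):
            if reviewer in truth:
                total += 1.0 / position
                break
    return total / len(rankings)


# Hypothetical data: three review requests, each with a ranked recommendation
# list and the set of developers who actually reviewed the change.
rankings = [
    ["alice", "bob", "carol"],
    ["bob", "carol", "alice"],
    ["carol", "alice", "bob"],
]
actual = [{"alice"}, {"alice"}, {"dave"}]

top1 = top_k_accuracy(rankings, actual, k=1)
top3 = top_k_accuracy(rankings, actual, k=3)
mrr = mean_reciprocal_rank(rankings, actual)
print(f"top-1={top1:.2f} top-3={top3:.2f} MRR={mrr:.2f}")
```

In this toy data the first request is a hit at rank 1, the second at rank 3, and the third has no correct reviewer in the list, so top-1 accuracy is 1/3, top-3 accuracy is 2/3, and MRR is (1 + 1/3 + 0) / 3.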
Text mining, Software, Computational modeling, Feature extraction, Analytical models, Accuracy, Control systems
X. Xia, D. Lo, X. Wang and X. Yang, "Who should review this change?: Putting text and file location analyses together for more accurate recommendations," 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME), Bremen, Germany, 2015, pp. 261-270.