Problems with Precision: A Response to "Comments on 'Data Mining Static Code Attributes to Learn Defect Predictors'"
Issue No. 09 - September (2007 vol. 33)
Tim Menzies, IEEE
Zhang and Zhang (hereafter, the Zhangs) argue that low-precision detectors such as those seen in Menzies, Greenwald, and Frank's paper "Data Mining Static Code Attributes to Learn Defect Predictors" (hereafter, DMP) are "not satisfactory for practical purposes". They demand that "a good prediction model should achieve both high Recall and high Precision" (which we will denote as "high precision & recall"). All other detectors, they argue, "may lead to impractical prediction models". We take a different view, and this short note explains why. Although we disagree with the Zhangs' conclusions, we find their derived equation to be an important result. Its insightful feature is that it uses information about the problem at hand to characterize the preconditions for detectors with both high precision and high recall. To the best of our knowledge, no such characterization has been reported previously (at least, not in the software engineering literature).
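The Zhangs' equation itself is not reproduced in this abstract. As a rough sketch of the kind of relationship at issue, the standard confusion-matrix definitions alone imply that precision depends on recall, on the false alarm rate, and on how many defective and defect-free modules the data set contains. The function name, the parameter names, and the numbers below are illustrative assumptions, not values taken from DMP or from the Zhangs' comment.

```python
def precision_from_rates(recall, false_alarm_rate, pos, neg):
    """Precision implied by recall, false alarm rate, and class counts.

    Uses only the standard definitions (an assumption of this sketch):
        recall           = TP / pos
        false_alarm_rate = FP / neg
        precision        = TP / (TP + FP)
    where pos and neg are the counts of defective and defect-free modules.
    """
    if pos <= 0 or neg < 0:
        raise ValueError("pos must be positive and neg non-negative")
    true_positives = recall * pos
    false_positives = false_alarm_rate * neg
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no modules flagged; precision is undefined")
    return true_positives / flagged


# Hypothetical detector: same recall and false alarm rate on two data sets
# that differ only in the proportion of defective modules.
rare_defects = precision_from_rates(0.7, 0.25, pos=10, neg=90)
balanced = precision_from_rates(0.7, 0.25, pos=50, neg=50)
print(f"defects rare:   precision = {rare_defects:.3f}")
print(f"defects common: precision = {balanced:.3f}")
```

Under these assumed numbers, the same detector scores a much lower precision when defective modules are rare, which is one way to see why properties of the data set affect whether high precision and high recall can be achieved together.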
Tim Menzies, Justin Distefano, Jeremy Greenwald, Alex Dekhtyar, "Problems with Precision: A Response to 'Comments on "Data Mining Static Code Attributes to Learn Defect Predictors"'," IEEE Transactions on Software Engineering, vol. 33, no. 9, pp. 637-640, September 2007, doi:10.1109/TSE.2007.70721