Approximation of COSMIC Functional Size of Scenario-Based Requirements in Agile Based on Syntactic Linguistic Features—A Replication Study
2016 Joint Conference of the International Workshop on Software Measurement and the International Conference on Software Process and Product Measurement (2016)
Oct. 5, 2016 to Oct. 7, 2016
Context: Expert judgment is the most frequently used method of effort estimation in Agile software development. Unfortunately, Agile teams often underestimate development effort. It therefore seems beneficial to support such teams with information about the functional size of the requirements they are estimating. Hussain, Kosseim and Ormandjieva (HKO) proposed a method that can be used to automatically classify textual requirements with respect to their COSMIC functional size. Unfortunately, the method has not been sufficiently validated to confirm its usefulness.
Objective: To provide external validation of the HKO method and investigate whether it can be applied to classify scenario-based requirements (in the form of use cases) with respect to their COSMIC size.
Method: Similarly to the original study, we used a set of natural language processing tools to extract syntactic linguistic features and C4.5 decision-tree-based classifiers to classify requirements. We validated the performance of the classifiers using 10-fold cross-validation on a dataset containing 93 use cases. We compared the performance of the HKO method with that of classifiers trained using a single prediction feature: the number of steps in a use case.
Results: Depending on the number of size classes considered and the algorithm used to compute the class boundaries, the accuracy of the HKO method ranged between .387 and .785, while the Cohen's kappa index was between .194 and .577. The use-case-steps-based classifiers performed slightly worse: their accuracy ranged between .015 and .769, while Cohen's kappa was between .067 and .423. We observed that the performance of both types of classifiers dropped visibly when applied to four or more size classes.
Conclusion: The classification performance of the HKO method was moderate. However, it was still better than the classification based on the number of steps. Unfortunately, we also observed that the accuracy of the HKO method is sensitive to the language used in descriptions of requirements.
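The two measures reported in the results, accuracy and Cohen's kappa, can be illustrated with a minimal Python sketch. It is not the study's implementation: the function names and the size-class labels ("S", "M", "L") are hypothetical, and the label lists are made-up illustration data, not results from the paper.

```python
from collections import Counter


def accuracy(actual, predicted):
    """Fraction of items whose predicted class equals the actual class."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def cohens_kappa(actual, predicted):
    """Agreement between two labelings, corrected for chance agreement."""
    n = len(actual)
    p_observed = accuracy(actual, predicted)
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    # Chance agreement: probability that both labelings pick the same class
    # if each assigned classes independently with its own class frequencies.
    p_expected = sum(
        actual_counts[c] * predicted_counts[c] for c in actual_counts
    ) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical size classes for ten use cases (illustration only).
actual = ["S", "S", "M", "M", "L", "L", "S", "M", "L", "M"]
predicted = ["S", "M", "M", "M", "L", "S", "S", "M", "L", "L"]

print(round(accuracy(actual, predicted), 3))      # 0.7
print(round(cohens_kappa(actual, predicted), 3))  # 0.545
```

In this made-up example, seven of ten use cases are classified correctly, but because about a third of that agreement would be expected by chance, kappa is noticeably lower than accuracy. This is why the two measures are reported together.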
Software, Size measurement, Syntactics, Pragmatics, Software measurement, Estimation, Planning
M. Ochodek, "Approximation of COSMIC Functional Size of Scenario-Based Requirements in Agile Based on Syntactic Linguistic Features—A Replication Study," 2016 Joint Conference of the International Workshop on Software Measurement and the International Conference on Software Process and Product Measurement (IWSM Mensura), Berlin, Germany, 2016, pp. 201-211.