Frontiers of Information Technology (2012)
Islamabad, Pakistan
Dec. 17, 2012 to Dec. 19, 2012
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/FIT.2012.40
Similar software systems have similar software measurements, so defect data from one system can be used to anticipate defects in a similar one. Although few defect datasets are made public in the software engineering domain, the PROMISE repository offers a reasonable collection of software data. This paper presents a two-step approach to identify similar software and applies it to find similar datasets in the PROMISE repository. In step 1, the approach generates association rules for each dataset to characterize the dataset's behavior in terms of frequent patterns. In step 2, the overlap between the association rules of two datasets is calculated using Fuzzy Inference Systems (FIS). Both expert-based and auto-generated FIS were built for the study. Similarity was computed for 28 dataset pairs; KC2 and PC1 turned out to be the most similar datasets, with 86% similarity using the Mamdani model and 92% using the Sugeno model. Results from the expert-based and auto-generated FIS were comparable.
association rules, software similarity, dataset similarity, software measures, FIS, fuzzy system
Saba Anwar, Zeeshan A. Rana, Mian M. Awais, "Identifying Similar Software Datasets through Fuzzy Inference System", Frontiers of Information Technology, pp. 181-187, 2012, doi:10.1109/FIT.2012.40
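The two-step approach described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy datasets, the function names (`mine_rules`, `overlap`, `sugeno_similarity`), the support and confidence thresholds, the linear membership functions, and the four-rule base are all assumptions chosen for illustration. Step 1 mines simple one-to-one association rules from discretized records; step 2 feeds the two directional rule overlaps into a small zero-order Sugeno-style fuzzy inference to produce a similarity score in [0, 1].

```python
from itertools import combinations

def mine_rules(rows, min_support=0.5, min_confidence=0.8):
    """Step 1 (sketch): mine rules lhs -> rhs with one item on each side."""
    n = len(rows)
    items = sorted({item for row in rows for item in row})

    def support(itemset):
        # Fraction of records that contain every item in itemset.
        return sum(1 for row in rows if itemset <= row) / n

    rules = set()
    for a, b in combinations(items, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue  # not a frequent pattern
        for lhs, rhs in ((a, b), (b, a)):
            if pair_support / support({lhs}) >= min_confidence:
                rules.add((lhs, rhs))
    return rules

def overlap(rules_a, rules_b):
    """Fraction of the rules in rules_a that also occur in rules_b."""
    if not rules_a:
        return 0.0
    return len(rules_a & rules_b) / len(rules_a)

def sugeno_similarity(x, y):
    """Step 2 (sketch): zero-order Sugeno inference over two overlaps in [0, 1].

    Assumed memberships: high(u) = u, low(u) = 1 - u. AND is min.
    """
    high_x, low_x = x, 1.0 - x
    high_y, low_y = y, 1.0 - y
    # (firing strength, constant rule output); hypothetical rule base.
    fired = [
        (min(high_x, high_y), 1.0),  # both overlaps high -> similar
        (min(high_x, low_y), 0.5),   # mixed -> partly similar
        (min(low_x, high_y), 0.5),   # mixed -> partly similar
        (min(low_x, low_y), 0.0),    # both overlaps low -> dissimilar
    ]
    total = sum(w for w, _ in fired)
    return sum(w * z for w, z in fired) / total  # weighted average

# Toy discretized records (hypothetical, not PROMISE data).
DATASET_A = [
    {"loc=high", "cc=high", "defect=yes"},
    {"loc=high", "cc=high", "defect=yes"},
    {"loc=low", "cc=low", "defect=no"},
    {"loc=low", "cc=low", "defect=no"},
]
DATASET_B = [
    {"loc=high", "cc=high", "defect=yes"},
    {"loc=high", "cc=high", "defect=yes"},
    {"loc=low", "cc=low", "defect=no"},
    {"loc=low", "cc=high", "defect=no"},
]

rules_a = mine_rules(DATASET_A)
rules_b = mine_rules(DATASET_B)
similarity = sugeno_similarity(overlap(rules_a, rules_b),
                               overlap(rules_b, rules_a))
print(f"similarity = {similarity:.2f}")
```

A Mamdani-style variant would replace the constant rule outputs with output fuzzy sets and add a defuzzification step; the Sugeno form is used here only because it keeps the sketch short.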