Issue No.05 - May (2013 vol.39)
pp: 725-741
Nasir Ali , École Polytechnique de Montréal, Montréal
Yann-Gaël Guéhéneuc , École Polytechnique de Montréal, Montréal
Giuliano Antoniol , École Polytechnique de Montréal, Montréal
ABSTRACT
Traceability is the only means to ensure that the source code of a system is consistent with its requirements and that all and only the specified requirements have been implemented by developers. During software maintenance and evolution, requirement traceability links become obsolete because developers do not or cannot devote effort to updating them. Yet recovering these traceability links later is a daunting and costly task for developers. Consequently, the literature has proposed methods, techniques, and tools to recover these traceability links semi-automatically or automatically. Among the proposed techniques, information retrieval (IR) techniques have been shown to automatically recover traceability links between free-text requirements and source code. However, IR techniques lack accuracy (precision and recall). In this paper, we show that mining software repositories and combining the mined results with IR techniques can improve the precision and recall of IR techniques, and we propose Trustrace, a trust-based traceability recovery approach. We apply Trustrace to four medium-size open-source systems to compare the accuracy of its traceability links with those recovered using state-of-the-art IR techniques from the literature, based on the Vector Space Model and the Jensen-Shannon model. The results of Trustrace are up to 22.7 percent more precise and have 7.66 percent better recall values than those of the other techniques, on average. We thus show that mining software repositories and combining the mined data with existing results from IR techniques improves the precision and recall of requirement traceability links.
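To make the terms used above concrete, the following is a minimal Python sketch of a Vector Space Model baseline for link recovery, together with the precision and recall measures used to judge it. It is not Trustrace and does not reproduce the paper's experimental setup: the TF-IDF weighting variant, the similarity threshold, the function names, and the toy requirements, classes, and oracle are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Build TF-IDF vectors for {name: [terms]} (one common weighting; an assumption)."""
    n = len(docs)
    df = Counter()
    for terms in docs.values():
        df.update(set(terms))  # document frequency: count each term once per document
    vectors = {}
    for name, terms in docs.items():
        tf = Counter(terms)
        vectors[name] = {
            t: (count / len(terms)) * math.log(n / df[t]) for t, count in tf.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recover_links(requirements, classes, threshold):
    """Return (requirement, class) pairs whose similarity reaches the threshold.

    Requirement and class names are assumed not to collide.
    """
    vectors = tf_idf_vectors({**requirements, **classes})
    return {
        (r, c)
        for r in requirements
        for c in classes
        if cosine(vectors[r], vectors[c]) >= threshold
    }


def precision_recall(recovered, oracle):
    """Precision = correct / recovered; recall = correct / expected (oracle)."""
    correct = recovered & oracle
    precision = len(correct) / len(recovered) if recovered else 0.0
    recall = len(correct) / len(oracle) if oracle else 0.0
    return precision, recall


# Hypothetical, already tokenized and stemmed toy data.
requirements = {
    "R1": ["send", "email", "message"],
    "R2": ["search", "address", "book"],
}
classes = {
    "Mailer": ["send", "message", "smtp"],
    "AddressBook": ["address", "book", "contact"],
}
# Hypothetical manually built oracle of true links.
oracle = {("R1", "Mailer"), ("R2", "AddressBook"), ("R2", "Mailer")}

recovered = recover_links(requirements, classes, threshold=0.2)
precision, recall = precision_recall(recovered, oracle)
```

In this toy run the baseline misses the ("R2", "Mailer") link because the two documents share no terms, which illustrates the kind of accuracy limitation the abstract attributes to IR techniques used alone.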
INDEX TERMS
Accuracy, Data mining, Software maintenance, Information retrieval, Open source software, Principal component analysis, trust-based model, Traceability, requirements, feature, source code, repositories, experts
CITATION
Nasir Ali, Yann-Gaël Guéhéneuc, Giuliano Antoniol, "Trustrace: Mining Software Repositories to Improve the Accuracy of Requirement Traceability Links", IEEE Transactions on Software Engineering, vol.39, no. 5, pp. 725-741, May 2013, doi:10.1109/TSE.2012.71