A Pattern Recognition Approach for Software Engineering Data Analysis
November 1992 (vol. 18 no. 11)
pp. 931-942

To plan, control, and evaluate the software development process, one needs to collect and analyze data in a meaningful way. Classical techniques for such analysis are not always well suited to software engineering data. This paper describes optimized set reduction (OSR), a pattern recognition approach for analyzing software engineering data that addresses many of the problems associated with the usual approaches. Methods are discussed for using the technique for prediction, risk management, and quality evaluation. Experimental results are provided to demonstrate the effectiveness of the technique for the particular application of software cost estimation.

Index Terms:
pattern recognition; software engineering data analysis; optimized set reduction; prediction; risk management; quality evaluation; software cost estimation; project management; software quality
L.C. Briand, V.R. Basili, W.M. Thomas, "A Pattern Recognition Approach for Software Engineering Data Analysis," IEEE Transactions on Software Engineering, vol. 18, no. 11, pp. 931-942, Nov. 1992, doi:10.1109/32.177363