Issue No.04 - July/August (2009 vol.35)
pp: 566-572
Les Hatton, CISM, University, United Kingdom
This paper begins by modeling general software systems using concepts from statistical mechanics, which provide a framework for linking microscopic and macroscopic features of any complex system. This analysis links two features of particular interest in software systems: first, the microscopic distribution of defects within components and, second, the macroscopic distribution of component sizes in a typical system. The former has been studied extensively, the latter much less so. This paper shows that, subject to the external constraint that the total number of defects in an equilibrium system is fixed, commonly used defect models for individual components directly imply that the distribution of component sizes in such a system will obey a power-law (Pareto) distribution. The paper then analyzes a large number of mature systems of different total sizes, different implementation languages, and very different application areas, and demonstrates that the component sizes do indeed appear to obey the predicted power-law distribution. Some possible implications are explored.
Defects, macroscopic system behavior, component size distribution, Pareto.
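The claim that component sizes follow a power law can be checked on any code base by estimating the exponent of the size distribution. The sketch below is not the procedure used in the paper. It assumes a continuous power law with density proportional to x^(-alpha) above a chosen cutoff x_min, and uses the standard maximum-likelihood estimate alpha = 1 + n / sum(ln(x_i / x_min)). The function names, the cutoff, and the synthetic sizes are all hypothetical, chosen only for illustration.

```python
import math
import random


def sample_pareto_sizes(n, x_min, alpha, seed=0):
    """Draw n synthetic component sizes with density ~ x**(-alpha) for x >= x_min.

    Illustrative stand-in for real measurements such as lines of code per component.
    """
    rng = random.Random(seed)
    # Inverse-transform sampling: x = x_min * (1 - u)**(-1 / (alpha - 1)), u in [0, 1).
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def estimate_exponent(sizes, x_min):
    """Maximum-likelihood estimate of alpha for a continuous power law above x_min."""
    tail = [s for s in sizes if s >= x_min]
    if len(tail) < 2:
        raise ValueError("need at least two sizes at or above x_min")
    log_sum = sum(math.log(s / x_min) for s in tail)
    return 1.0 + len(tail) / log_sum


# Assumed values: 20000 components, cutoff of 10 units, true exponent 2.5.
sizes = sample_pareto_sizes(n=20000, x_min=10.0, alpha=2.5, seed=42)
alpha_hat = estimate_exponent(sizes, x_min=10.0)
print(f"estimated exponent: {alpha_hat:.2f}")
```

With real data, `sizes` would be replaced by measured component sizes, and the choice of `x_min` matters: a power law usually describes only the tail of the distribution, so sizes below the cutoff are excluded from the estimate.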
Les Hatton, "Power-Law Distributions of Component Size in General Software Systems", IEEE Transactions on Software Engineering, vol.35, no. 4, pp. 566-572, July/August 2009, doi:10.1109/TSE.2008.105