Issue No.04 - July-Aug. (2013 vol.30)
pp: 64-71
Jacek Czerwonka, Microsoft
Nachiappan Nagappan, Microsoft Research
Wolfram Schulte, Microsoft
Brendan Murphy, Microsoft Research
The scale and speed of today's software development efforts impose unprecedented constraints on the pace and quality of decisions made during planning, implementation, and postrelease maintenance and support for software. Planning decisions include setting staffing levels and choosing a development model given a project's scope and timelines. Tracking progress, course correcting, and identifying and mitigating risks are key in the development phase, as is monitoring and improving overall customer satisfaction in the maintenance and support phase. Availability of relevant data can greatly increase both the speed of decision making and the likelihood that a decision leads to a successful software system. This article outlines the process Microsoft has gone through in developing CODEMINE, a software development data analytics platform for collecting and analyzing engineering process data, along with its constraints and pivotal organizational and technical choices.
Software development, Analytical models, Software quality, Data models, Data analysis, Computer bugs, Computer architecture, Software architecture, software analytics, code quality, metrics, mining, reliability, software repositories
Jacek Czerwonka, Nachiappan Nagappan, Wolfram Schulte, Brendan Murphy, "CODEMINE: Building a Software Development Data Analytics Platform at Microsoft", IEEE Software, vol.30, no. 4, pp. 64-71, July-Aug. 2013, doi:10.1109/MS.2013.68