2017 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE) (2017)
Feb. 21, 2017
Timothy Chappell, Queensland University of Technology, Brisbane, Australia
Cristina Cifuentes, Oracle Labs, Brisbane, Australia
Padmanabhan Krishnan, Oracle Labs, Brisbane, Australia
Shlomo Geva, Queensland University of Technology, Brisbane, Australia
Static program analysis is a technique for analysing code without executing it, and can be used to find bugs in source code. Many open source and commercial tools have been developed in this space over the past 20 years. Scalability and precision are important for deploying static code analysis tools: numerous false positives and slow runtimes both make a tool hard to adopt in development, where integration into a nightly build is the standard goal. Meeting these requirements means identifying a suitable abstraction for the static analysis, which is typically a manual process and can be expensive. In this paper we report our findings on using machine learning techniques to detect defects in C programs. We use three off-the-shelf machine learning techniques and a large corpus of programs for both training and evaluation. We compare the results produced by the machine learning techniques against the Parfait static program analysis tool, used internally at Oracle by thousands of developers. While on the surface the initial results were encouraging, further investigation suggests that the machine learning techniques we used are not suitable replacements for static program analysis tools due to the low precision of the results. This could be due to a variety of reasons, including not using domain knowledge such as the semantics of the programming language, and a lack of suitable data in the training process.
Computer bugs, Training, Benchmark testing, Complexity theory, Feature extraction, Data models
T. Chappell, C. Cifuentes, P. Krishnan and S. Geva, "Machine learning for finding bugs: An initial report," 2017 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE), Klagenfurt, Austria, 2017, pp. 21-26.