This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
Dynamically Discovering Likely Program Invariants to Support Program Evolution
February 2001 (vol. 27 no. 2)
pp. 99-123

Abstract—Explicitly stated program invariants can help programmers by identifying program properties that must be preserved when modifying code. In practice, however, these invariants are usually implicit. An alternative to expecting programmers to fully annotate code with invariants is to automatically infer likely invariants from the program itself. This research focuses on dynamic techniques for discovering invariants from execution traces. This article reports three results. First, it describes techniques for dynamically discovering invariants, along with an implementation, named Daikon, that embodies these techniques. Second, it reports on the application of Daikon to two sets of target programs. In programs from Gries's work on program derivation, the system rediscovered predefined invariants. In a C program lacking explicit invariants, the system discovered invariants that assisted a software evolution task. These experiments demonstrate that, at least for small programs, invariant inference is both accurate and useful. Third, it analyzes scalability issues, such as invariant detection runtime and accuracy, as functions of test suites and program points instrumented.

[1] D. Abramson, I. Foster, J. Michalakes, and R. Socic, “Relative Debugging: A New Methodology for Debugging Scientific Applications,” Comm. ACM, vol. 39, no. 11, pp. 69–77, Nov. 1996.
[2] J.H. Andrews, “Testing Using Log File Analysis: Tools, Methods and Issues,” Proc. 13th Ann. Int'l Conf. Automated Software Eng. (ASE '98), pp. 157–166, Oct. 1998.
[3] N. Bjørner, A. Browne, and Z. Manna, “Automatic Generation of Invariants and Intermediate Assertions,” Theoretical Computer Science, vol. 173, no. 1, pp. 49–87, Feb. 1997.
[4] I. Bratko and M. Grobelnik, “Inductive Learning Applied to Program Construction and Verification,” Knowledge Oriented Software Design: Extended Papers from the IFIP TC 12 Workshop Artificial Intelligence from the Information Processing Perspective, (AIFIPP '92), J. Cuena, ed., pp. 169–182, 1993.
[5] B. Boigelot and P. Godefroid, “Automatic Synthesis of Specifications from the Dynamic Observation of Reactive Programs,” Proc. Third Int'l Workshop Tools and Algorithms for the Construction and Analysis of Systems (TACAS '97), pp. 321–333, Apr. 1997.
[6] S. Bensalem, Y. Lakhnech, and H. Saidi, “Powerful Techniques for the Automatic Generation of Invariants,” Proc. Eighth Int'l Conf. Computer Aided Verification (CAV), pp. 323–335, July/Aug. 1996.
[7] M. Blum, “Designing Programs to Check Their Work,” Proc. Int'l Symp. Software Testing and Analysis, T. Ostrand and E. Weyuker, eds., p. 1, June 1993.
[8] E.C. Chan, J.T. Boyland, and W.L. Scherlis, “Promises: Limited Specifications for Analysis and Manipulation,” Proc. 20th Int'l Conf. Software Eng., pp. 167–176, Apr. 1998.
[9] P.M. Cousot and R. Cousot, “Automatic Synthesis of Optimal Invariant Assertions: Mathematical Foundations,” Proc. ACM Symp. Artificial Intelligence and Programming Languages, pp. 1–12, Aug. 1977.
[10] B. Calder, P. Feller, and A. Eustace, “Value Profiling,” Proc. 27th Ann. Int'l Symp. Microarchitecture (MICRO-97), pp. 259–269, Dec. 1997.
[11] B. Calder, P. Feller, and A. Eustace, “Value Profiling and Optimization,” J. Instruction Level Parallelism, vol. 1, Mar. 1999, http://www.cs.princeton.edu/sip/pub/sosp97.htmlhttp:/ /www.jilp.orgvol1/.
[12] B.H.C. Cheng and G.C. Gannod, “Abstraction of Formal Specifications from Program Code,” Proc. Third Int'l Conf. Tools for Artificial Intelligence (TAI '91), pp. 125–128, Nov. 1991.
[13] Watch What I Do: Programming by Demonstration, A. Cypher, D.C. Halbert, D. Kurlander, H. Lieberman, D. Maulsby, B.A. Myers, and A. Turransky eds., Cambridge, Mass.: MIT Press, 1993.
[14] W.W. Cohen, “Grammatically Biased Learning: Learning Logic Programs Using an Explicit Antecedent Description Language,” Artificial Intelligence, vol. 68, pp. 303–366, Aug. 1994.
[15] J.E. Cook and A.L. Wolf, “Discovering Models of Software Processes from Event-Based Data,” ACM Trans. Software Eng. and Methodology, vol. 7, no. 3, pp. 215–249, July 1998.
[16] J.E. Cook and A.L. Wolf, “Event-Based Detection of Concurrency,” Proc. ACM SIGSOFT '98 Symp. Foundations of Software Eng., pp. 35–45, Nov. 1998.
[17] E.M. Clarke and J.M. Wing, "Formal Methods: State of the Art and Future Directions," ACM Computing Surveys, vol. 28, no. 4, pp. 626-643, 1996.
[18] D. R. Chase, M. Wegman, and F. K. Zadeck,“Analysis of pointers and structures,”inProc. SIGPLAN '90 Conf. Programming Languages Design and Implementation,June 1990, pp. 296–310;SIGPLAN Notices, vol. 25, no. 6.
[19] D.D. Dunlop and V.R. Basili, “A Heuristic for Deriving Loop Functions,” IEEE Trans. Software Eng., vol. 10, no. 3, pp. 275–285, May 1984.
[20] M.B. Dwyer and L.A. Clarke, "Data Flow Analysis for Verifying Properties of Concurrent Programs," Software Eng. Notes, vol. 19, no. 5, pp. 62-75, Proc. ACM SIGSOFT Symp. Foundations of Software Eng., Dec. 1994.
[21] D.L. Detlefs, “An Overview of the Extended Static Checking System,” Proc. First Workshop Formal Methods in Software Practice, pp. 1–9, Jan. 1996.
[22] E.W. Dijkstra, A Discipline of Programming.Englewood Cliffs, N.J.: Prentice Hall, 1976.
[23] D.L. Detlefs, K. Rustan, M. Leino, G. Nelson, and J.B. Saxe, “Extended Static Checking,” SRC Research Report 159, Compaq Systems Research Center, Dec. 1998.
[24] D. Bruening, S. Devabhaktuni, and S. Amarasinghe, “Softspec: Software-Based Speculative Parallelism,” MIT/LCS Technical Memo, LCS-TM-606, Apr. 2000.
[25] G. Dromey, Program Derivation: The Development of Programs from Specifications, Addison-Wesley, 1989.
[26] M.D. Ernst, J. Cockrell, W.G. Griswold, and D. Notkin, “Dynamically Discovering Likely Program Invariants to Support Program Evolution,” Proc. 21st Int'l Conf. Software Eng., pp. 213–224, May 1999.
[27] M.D. Ernst, A. Czeisler, W.G. Griswold, and D. Notkin, “Quickly Detecting Relevant Program Invariants,” Proc. 22nd Int'l Conf. Software Eng., pp. 449-458, June 2000.
[28] Edison Design Group, C++ Front End Internal Documentation, version 2.28 ed., Mar. 1995, http:/www.edg.com.
[29] D. Evans, J. Guttag, J. Horning, and Y.M. Tan, “LCLint: A Tool for Using Specifications to Check Code,” Proc. Second ACM SIGSOFT Symp. the Foundations of Software Eng. (SIGSOFT '94), pp. 87–97, Dec. 1994.
[30] M.D. Ernst, W.G. Griswold, Y. Kataoka, and D. Notkin, “Dynamically Discovering Pointer-Based Program Invariants,” Technical Report UW-CSE-99-11-02, Univ. of Washington, Seattle, Wash., Nov. 1999.
[31] M.D. Ernst, “Dynamically Discovering Likely Program Invariants,” PhD thesis, Dept. of Computer Science and Eng., Univ. of Washington, Seattle, Wash., Aug. 2000.
[32] D. Evans, “Static Detection of Dynamic Memory Errors,” Proc. SIGPLAN '96 Conf. Programming Language Design and Implementation, pp. 44–53, May 1996.
[33] G.C. Gannod and B.H.C. Cheng, “Strongest Postcondition Semantics as the Formal Basis for Reverse Engineering,” J. Automated Software Eng., vol. 3, nos. 1-2, pp. 139–164, June 1996.
[34] G.C. Gannod and B.H.C. Cheng, “A Specification Matching Based Approach to Reverse Engineering,” Proc. 21st Int'l Conf. Software Eng., pp. 389–398, May 1999.
[35] M. Golan and D.R. Hanson, “DUEL—A Very High-Level Debugging Language,” Proc. 1993 USENIX Conf., pp. 107–117, Jan. 1993.
[36] R. Ghiya and L.J. Hendren, “Is It a Tree, a DAG, or a Cyclic Graph? A Shape Analysis for Heap-Directed Pointers in C,” Proc. 23rd Ann. ACM SIGPLAN-SIGACT Symp. Principles of Programming Languages, pp. 1–15, Jan. 1996.
[37] R. Givan, “Inferring Program Specifications in Polynomial-Time,” Proc. Third Int'l Symp. Static Analysis (SAS '96), pp. 205–219, Sept. 1996.
[38] R.L. Givan Jr., “Automatically Inferring Properties of Computer Programs,” PhD thesis, Mass. Inst. of Technology, Cambridge, Mass., June 1996.
[39] D. Gries, The Science of Programming.New York, Heidelberg, Berlin: Springer-Verlag, 1981.
[40] R. Gupta, “A Fresh Look at Optimizing Array Bound Checking,” Proc. SIGPLAN '90 Conf. Programming Language Design and Implementation, pp. 272–282, June 1990.
[41] S.M. German and B. Wegbreit, “A Synthesizer of Inductive Assertions,” IEEE Trans. Software Eng., vol. 1, no. 1, pp. 68–75, Mar. 1975.
[42] D. Hamlet, “Random Testing,” Encyclopedia of Software Eng., 1994.
[43] M. Hutchins, H. Foster, T. Goradia, and T. Ostrand, “Experiments on the Effectiveness of Dataflow- and Controlflow-Based Test Adequacy Criteria,” Proc. Int'l Conf. Software Eng., pp. 191–200, May 1994.
[44] L.J. Hendren, J. Hummel, and A. Nicolau, “Abstractions for Recursive Pointer Data Structures: Improving the Analysis and Transformation of Imperative programs,” Proc. SIGPLAN '92 Conf. Programming Language Design and Implementation, pp. 249–260, June 1992.
[45] R. Hastings and B. Joyce, “Purify: A Tool for Detecting Memory Leaks and Access Errors in C and C++ Programs,” Proc. USENIX Conf., pp. 125–138, Jan. 1992.
[46] C.A.R. Hoare, "An Axiomatic Basis for Computer Programming," Comm. ACM, 1969.
[47] M.J. Harrold, G. Rothermel, R. Wu, and L. Yi, “An Empirical Investigation of Program Spectra,” ACM SIGPLAN/SIGSOFT Workshop Program Analysis for Software Tools and Eng. (PASTE '98), pp. 83–90, June 1998.
[48] A.S. Huang, G. Slavenburg, and J.P. Shen, “Speculative Disambiguation: A Compilation Technique for Dynamic Memory Disambiguation,” Proc. 21st Ann. Int'l Symp. Computer Architecture, pp. 200–210, Apr. 1994.
[49] R. Henry, K.M. Whaley, and B. Forstall, “The University of Washington Illustrating Compiler,” Proc. SIGPLAN '90 Conf. Programming Language Design and Implementation, pp. 223–246, June 1990.
[50] N.D. Jones, C. Gomard, and P. Sestoft, Partial Evaluation and Automatic Program Generation, Int'l Series in Computer Science, Englewood Cliffs, N.J.: Prentice Hall, June 1993.
[51] R. Jeffords and C. Heitmeyer, "Automatic Generation of State Invariants from Requirements Specifications," Proc. Sixth ACM SIGSOFT Symp. Foundations of Software Eng., Nov. 1998.
[52] R.W.M. Jones, “A Strategy for Finding the Optimal Data Placement for Regular Programs,” master's thesis, Dept. of Computing, Imperial College, 1996.
[53] B. Jacobs, J. van den Berg, M. Huisman, M. van Berkum, U. Hensel, and H. Tews, “Reasoning About Java Classes,” Object-Oriented Programming Systems, Languages, and Applications (OOPSLA '98), pp. 329–340, Oct. 1998.
[54] M. Karaorman, U. Holzle, and J. Bruno, “jContractor: A Reflective Java Library to Support Design by Contract,” Technical Report TRCS98-31, Univ. of Calif., Santa Barbara, Jan. 1999.
[55] S.C. Kleene, “Representation of Events in Nerve Nets and Finite Automata,” Automata Studies, Annals of Math. Studies 34, C.E. Shannon and J. McCarthy, eds., pp. 3–40, 1956.
[56] S. Katz and Z. Manna, “Logical Analysis of Programs,” Comm. ACM, vol. 19, no. 4, pp. 188–206, Apr. 1976.
[57] P. Kolte and M. Wolfe, “Elimination of Redundant Array Subscript Range Checks,” Proc. SIGPLAN '95 Conf. Programming Language Design and Implementation, pp. 270–278, June 1995.
[58] N.G. Leveson, S.S. Cha, and J.C. Knight, T.J. Shimeall, “The Use of Self Checks and Voting in Software Detection: An Empirical Study,” IEEE Trans. Software Eng., vol. 16, pp. 432-443, 1990.
[59] S.-W. Liao, A. Diwan, J. Robert, P. Bosch, A. Ghuloum, and M.S. Lam, “SUIF Explorer: an Interactive and Interprocedural Parallelizer,” Proc. Seventh ACM SIGPLAN, PPOPP, pp. 37–48, May 1999.
[60] B. Liskov and J. Guttag, Abstraction and Specification in Software Development, MIT Press, Cambridge, Mass., 1986.
[61] R. Lencevicius, U. Hölzle, and A.K. Singh, “Query-Based Debugging of Object-Oriented Programs,” Proc. Conf. Object-Oriented Programming, Systems, Languages, and Applications, pp. 304–317, Oct. 1997.
[62] K.R.M. Leino and G. Nelson, “An Extended Static Checker for Modula-3,” Proc. Compiler Construction: Seventh Int'l Conf. (CC '98), pp. 302–305, Apr. 1998.
[63] W. Landi and B.G. Ryder, "A Safe Approximate Algorithm for Pointer-Induced Aliasing," Proc. ACM SIGPLAN'92 Conf. Programming Language Design and Implementation (PLDI'92), pp. 235-248, 1992. published as SIGPLAN Notices, vol. 27, no. 7.
[64] T. Mitchell, Machine Learning, McGraw-Hill, 1997.
[65] G. Naumovich, L.A. Clarke, L.J. Osterweil, and M.B. Dwyer, “Verification of Concurrent Software with FLAVERS,” Proc. 19th Int'l Conf. Software Eng., pp. 594–595, May 1997.
[66] A. Nicolau,"Run-Time Disambiguation: Coping With Statically Unpredictable Dependencies," IEEE Trans. Computers, vol. 38, no. 5, pp. 663-678, May 1989.
[67] G.C. Necula and P. Lee, “The Design and Implementation of a Certifying Compiler,” Proc. ACM SIGPLAN '98 Conf. Programming Language Design and Implementation, pp. 333–344, June 1998.
[68] R. O'Callahan and D. Jackson, "Lackwit: A Program Understanding Tool Based on Type Inference," Proc. Int'l Conf. Software Engineering, IEEE Computer Soc. Press, Los Alamitos, Calif., 1997, pp. 338-348.
[69] F. Pfenning, “Dependent Types in Logic Programming,” Types in Logic Programming, F. Pfenning, ed., chapter 10, pp. 285–311, 1992.
[70] J.R. Quinlan, “Learning Logical Definitions from Relations,” Machine Learning, vol. 5, pp. 239–266, 1990.
[71] T. Reps, T. Ball, M. Das, and J. Larus, “The Use of Program Profiling for Software Maintenance with Applications to the Year 2000 Problem,” Proc. Sixth European Software Eng. Conf. and Fifth ACM SIGSOFT Symp. Foundations of Software Eng. (ESEC/FSE '97), pp. 432–449, Sept. 1997.
[72] G. Rothermel and M.J. Harrold, “Empirical Studies of a Safe Regression Test Selection Technique,” IEEE Trans. Software Eng., vol. 24, no. 6, pp. 401-419, June 1998.
[73] D.S. Rosenblum, “A Practical Approach to Programming With Assertions,” IEEE Trans. Software Eng., vol. 21, no. 1, pp. 19–31, Jan. 1995.
[74] B. Su, S. Habib, W. Zhao, J. Wang, and Y. Wu, “A Study of Pointer Aliasing for Software Pipelining Using Run-Time Disambiguation,” Proc. 27th Ann. Int'l Symp. Microarchitecture (MICRO-97), pp. 112–117, Nov./Dec. 1994.
[75] N. Suzuki and K. Ishihata, “Implementation of an Array Bound Checker,” Proc. Fourth Ann. ACM Symp. Principles of Programming Languages, pp. 132–143, Jan. 1977.
[76] L. Shu and M. Young, “A Mixed Locking/Abort Protocol for Hard Real-TimeSystems,” Proc. IEEE 11th Workshop Real-Time Operating Systems and Software, pp. 102-106, May 1994.
[77] A. Sodani and G.S. Sohi, “An Empirical Analysis of Instruction Repetition,” Proc. Eighth Int'l Symp. Architectural Support for Programming Languages and Operating Systems, pp. 35-45, Oct. 1998.
[78] Y.M. Tan, “Formal Specification Techniques for Promoting Software Modularity, Enhancing Documentation, and Testing Specifications,” Technical Report MIT/LCS/TR-619, Mass. Inst. of Technology, Laboratory for Computer Science, June 1994.
[79] G. van Rossum, Python Reference Manual, 1.5 ed., Dec. 1997.
[80] M. Vaziri and G. Holzmann, “Automatic Detection of Invariants in Spin,” SPIN 98: Papers from the Fourth Int'l SPIN Workshop, Nov. 1998.
[81] M.P. Ward, “Program Analysis by Formal Transformation,” The Computer J., vol. 39, no. 7, pp. 598–618, 1996.
[82] H. Wasserman and M. Blum, “Software Reliability via Run-Time Result-Checking,” J. ACM, vol. 44, no. 6, pp. 826-849, 1997.
[83] M. Ward, F.W. Calliss, and M. Munro, “The Maintainer's Assistant,” Proc. Int'l Conf. Software Maintenance, pp. 307–315, 1989.
[84] B. Wegbreit, “The Synthesis of Loop Predicates,” Comm. ACM, vol. 17, no. 2, pp. 102–112, Feb. 1974.
[85] R.P. Wilson and M.S. Lam, “Efficient Context-Sensitive Pointer Analysis for C Programs,” Proc. Conf. Programming Language Design and Implementation. pp. 1-12, 1995.
[86] H. Xi and F. Pfenning, “Eliminating Array Bound Checking Through Dependent Types,” Proc. ACM SIGPLAN '98 Conf. Programming Language Design and Implementation, pp. 249–257, June 1998.
[87] H. Xi and F. Pfenning, “Dependent Types in Practical Programming,” Proc. 26th Ann. ACM SIGPLAN-SIGACT Symp. Principles of Programming Languages, pp. 214–227, Jan. 1999.
[88] C. Zenger, “Indexed Types,” Theoretical Computer Science, vol. 187, pp. 147–165, 1997.

Index Terms:
Program invariants, formal specification, software evolution, dynamic analysis, execution traces, logical inference, pattern recognition.
Citation:
Michael D. Ernst, Jake Cockrell, William G. Griswold, David Notkin, "Dynamically Discovering Likely Program Invariants to Support Program Evolution," IEEE Transactions on Software Engineering, vol. 27, no. 2, pp. 99-123, Feb. 2001, doi:10.1109/32.908957
Usage of this product signifies your acceptance of the Terms of Use.