Evaluating Performance and Quality of Knowledge-Based Systems: Foundation and Methodology
April 1993 (vol. 5 no. 2)
pp. 204-224

A survey of knowledge-based system (KBS) evaluation methods is presented. The authors argue that existing methods are partial, unsystematic, and difficult to apply. They propose an approach to KBS evaluation that comprises a precise definition of the concepts of performance and quality, a general evaluation methodology, and a set of criteria to support its practical application. So far, the proposed approach has been tried only partially and on rather simple test cases.

Index Terms:
knowledge-based system; evaluation methods; KBS evaluation; general evaluation methodology; practical application; knowledge based systems; performance evaluation
Citation:
G. Guida, G. Mauri, "Evaluating Performance and Quality of Knowledge-Based Systems: Foundation and Methodology," IEEE Transactions on Knowledge and Data Engineering, vol. 5, no. 2, pp. 204-224, April 1993, doi:10.1109/69.219731