Toward Intelligent Assistance for a Data Mining Process: An Ontology-Based Approach for Cost-Sensitive Classification
April 2005 (vol. 17, no. 4)
pp. 503-518
A data mining (DM) process involves multiple stages. A simple, but typical, process might include preprocessing data, applying a data mining algorithm, and postprocessing the mining results. There are many possible choices for each stage, and only some combinations are valid. Because of the large space and nontrivial interactions, both novices and data mining specialists need assistance in composing and selecting DM processes. Extending notions developed for statistical expert systems, we present a prototype Intelligent Discovery Assistant (IDA), which provides users with 1) systematic enumerations of valid DM processes, in order that important, potentially fruitful options are not overlooked, and 2) effective rankings of these valid processes by different criteria, to facilitate the choice of DM processes to execute. We use the prototype to show that an IDA can indeed provide useful enumerations and effective rankings in the context of simple classification processes. We discuss how an IDA could be an important tool for knowledge sharing among a team of data miners. Finally, we illustrate the claims with a demonstration of cost-sensitive classification using a more complicated process and data from the 1998 KDD Cup competition.
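The core idea of the abstract, enumerating only the valid compositions of preprocessing, induction, and postprocessing operators and then ranking them, can be sketched in a few lines. The sketch below is illustrative only, not the authors' implementation: the operator names, their planning-style pre/postconditions, and the multiplicative speed heuristic are all hypothetical stand-ins for the paper's ontology and ranking criteria.

```python
# Illustrative sketch (not the paper's system): enumerate valid DM processes
# from a small operator "ontology" with pre/postconditions, then rank them.
# All operator names, conditions, and speed scores here are hypothetical.
from itertools import product

OPERATORS = {
    "preprocess": [
        {"name": "none",       "requires": set(),         "adds": set(),         "speed": 1.0},
        {"name": "discretize", "requires": {"numeric"},   "adds": {"discrete"},  "speed": 0.9},
        {"name": "sample",     "requires": set(),         "adds": {"small"},     "speed": 3.0},
    ],
    "induce": [
        {"name": "decision-tree", "requires": set(),        "adds": {"model"}, "speed": 1.0},
        {"name": "naive-bayes",   "requires": {"discrete"}, "adds": {"model"}, "speed": 2.0},
    ],
    "postprocess": [
        {"name": "none",  "requires": {"model"}, "adds": set(), "speed": 1.0},
        {"name": "prune", "requires": {"model"}, "adds": set(), "speed": 0.8},
    ],
}

def valid_processes(initial_state):
    """Enumerate operator sequences whose preconditions are all satisfied
    by the data state accumulated by the preceding operators."""
    results = []
    for pre, ind, post in product(*OPERATORS.values()):
        state = set(initial_state)
        ok = True
        for op in (pre, ind, post):
            if not op["requires"] <= state:
                ok = False
                break
            state |= op["adds"]
        if ok:
            results.append((pre, ind, post))
    return results

def rank_by_speed(processes):
    """Rank valid processes by a crude multiplicative speed heuristic."""
    return sorted(processes,
                  key=lambda p: -(p[0]["speed"] * p[1]["speed"] * p[2]["speed"]))

if __name__ == "__main__":
    plans = valid_processes({"numeric"})
    for plan in rank_by_speed(plans):
        print(" -> ".join(op["name"] for op in plan))
```

With the hypothetical operators above, 8 of the 12 syntactic combinations survive the validity check (e.g., `naive-bayes` is admitted only after `discretize` supplies discrete attributes), and the ranking surfaces the fast `sample -> decision-tree` plans first, mirroring the enumerate-then-rank structure the abstract describes.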

[1] U. Fayyad, G. Piatetsky-Shapiro, and P. Smyth, “The KDD Process for Extracting Useful Knowledge from Volumes of Data,” Comm. ACM, vol. 39, pp. 27-34, 1996.
[2] P. Chapman, J. Clinton, R. Kerber, T. Khabaza, T. Reinartz, C. Shearer, and R. Wirth, “CRISP-DM 1.0: Step-by-Step Data Mining Guide,” SPSS Inc., 2000, http://www.crisp-dm.org/CRISPWP-0800.pdf.
[3] T. Senator, “Ongoing Management and Application of Discovered Knowledge in a Large Regulatory Organization: A Case Study of the Use and Impact of NASD Regulation's Advanced Detection System (RADS),” Proc. Sixth Int'l Conf. Knowledge Discovery and Data Mining (KDD 2000), pp. 44-53, 2000.
[4] R. St. Amant and P.R. Cohen, “Intelligent Support for Exploratory Data Analysis,” J. Computational and Graphical Statistics, vol. 7, pp. 545-558, 1998.
[5] I.H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. San Francisco: Morgan Kaufmann, 1999.
[6] K.T. Ulrich and S.D. Eppinger, Product Design and Development, third ed. Boston: McGraw-Hill/Irwin, 2004.
[7] R. Kohavi, C.E. Brodley, B. Frasca, L. Mason, and Z. Zheng, “KDD-Cup 2000 Organizers' Report: Peeling the Onion,” SIGKDD Explorations, vol. 2, pp. 86-93, 2000.
[8] M. Ghallab, C. Howe, C. Knoblock, D. McDermott, A. Ram, M. Veloso, D. Weld, and D. Wilkins, “PDDL—The Planning Domain Definition Language,” Yale Univ., New Haven, CT, TR-98-003/DCS TR-1165, ftp.cs.yale.edu/pub/mcdermott/software/pddl.tar.gz, 1998.
[9] A. Ankolekar, M. Burstein, J.R. Hobbs, O. Lassila, D.L. Martin, S.A. McIlraith, S. Narayanan, M. Paolucci, T. Payne, K. Sycara, and H. Zeng, “DAML-S: Semantic Markup for Web Services,” Proc. Semantic Web Working Symp., 2001.
[10] E. Christensen, F. Curbera, G. Meredith, and S. Weerawarana, “Web Services Description Language 1.1,” World Wide Web Consortium (W3C) Technical Note, www.w3.org/TR/wsdl, 2001.
[11] P. Turney, “Cost-Sensitive Learning Bibliography,” Online bibliography, NRC Inst. for Information Technology, Ottawa, Canada, 2001, http://members.rogers.com/peter.turney/bibliographies/cost-sensitive.html.
[12] J.R. Quinlan, “Simplifying Decision Trees,” Int'l J. Man-Machine Studies, vol. 27, pp. 221-234, 1987.
[13] T.S. Lim, W.Y. Loh, and Y.S. Shih, “A Comparison of Prediction Accuracy, Complexity, and Training Time of Thirty-Three Old and New Classification Algorithms,” Machine Learning, vol. 40, pp. 203-228, 2000.
[14] F. Provost, D. Jensen, and T. Oates, “Efficient Progressive Sampling,” Proc. Fifth Int'l Conf. Knowledge Discovery and Data Mining (KDD '99), pp. 23-32, 1999.
[15] R. Kohavi and M. Sahami, “Error-Based and Entropy-Based Discretization of Continuous Features,” Proc. Second Int'l Conf. on Knowledge Discovery and Data Mining (KDD '96), pp. 114-119, 1996.
[16] F. Provost and V. Kolluri, “A Survey of Methods for Scaling Up Inductive Algorithms,” Data Mining and Knowledge Discovery, vol. 3, pp. 131-169, 1999.
[17] T. Oates and D. Jensen, “The Effects of Training Set Size on Decision Tree Complexity,” Proc. 14th Int'l Conf. Machine Learning (ICML '97), pp. 254-262, 1997.
[18] C. Perlich, F. Provost, and J.S. Simonoff, “Tree Induction vs. Logistic Regression: A Learning-Curve Analysis,” J. Machine Learning Research, vol. 4, pp. 211-255, 2004.
[19] K. Morik and M. Scholz, “The MiningMart Approach to Knowledge Discovery in Databases,” Intelligent Technologies for Information Analysis, N. Zhong and J. Liu, eds., Springer, pp. 47-64, 2003.
[20] C.L. Blake and C.J. Merz, “UCI Repository of Machine Learning Databases,” Univ. of California, Dept. of Information and Computer Science, 2000, http://www.ics.uci.edu/~mlearn/MLRepository.html.
[21] N. Agrawal, “Urban Science Wins the KDD-98 Cup. A Second Straight Victory for GainSmarts,” 1998, http://www.kdnuggets.com/meetings/kdd98/gain-kddcup98-release.html.
[22] B. Zadrozny and C. Elkan, “Learning and Making Decisions When Costs and Probabilities are Both Unknown,” Proc. Seventh Int'l Conf. Knowledge Discovery and Data Mining (KDD 2001), pp. 204-213, 2001.
[23] P.B. Brazdil, “Data Transformation and Model Selection by Experimentation and Meta-Learning,” Proc. 10th European Conf. Machine Learning (ECML '98): Workshop Upgrading Learning to the Meta-Level: Model Selection and Data Transformation, pp. 11-17, 1998.
[24] K. Morik, “The Representation Race— Preprocessing for Handling Time Phenomena,” Proc. 11th European Conf. Machine Learning (ECML 2000), pp. 4-19, 2000.
[25] R. Engels, “Planning Tasks for Knowledge Discovery in Databases; Performing Task-Oriented User-Guidance,” Proc. Second Int'l Conf. Knowledge Discovery in Databases (KDD '96), pp. 170-175, 1996.
[26] R. Engels, G. Lindner, and R. Studer, “A Guided Tour through the Data Mining Jungle,” Proc. Third Int'l Conf. Knowledge Discovery in Databases (KDD '97), pp. 163-166, 1997.
[27] R. Wirth, C. Shearer, U. Grimmer, T. Reinartz, J. Schlösser, C. Breitner, R. Engels, and G. Lindner, “Towards Process-Oriented Tool Support for KDD,” Proc. First European Symp. Principles of Data Mining and Knowledge Discovery (PKDD '97), pp. 55-64, 1997.
[28] F. Verdenius and R. Engels, “A Process Model for Developing Inductive Applications,” Proc. Seventh Belgian-Dutch Conf. Machine Learning, pp. 119-128, 1997.
[29] B. Chandrasekaran, T.R. Johnson, and J.W. Smith, “Task-Structure Analysis for Knowledge Modeling,” Comm. ACM, vol. 35, pp. 124-137, 1992.
[30] W. Buntine, B. Fischer, and T. Pressburger, “Towards Automated Synthesis of Data Mining Programs,” Proc. Fifth Int'l Conf. Knowledge Discovery and Data Mining (KDD '99), pp. 372-376, 1999.
[31] S. Craw, D. Sleeman, N. Graner, M. Rissakis, and S. Sharma, “Consultant: Providing Advice for the Machine Learning Toolbox,” Research and Development in Expert Systems IX: Proc. Expert Systems, 12th Ann. Technical Conf. British Computer Soc. Specialist Group on Expert Systems, pp. 5-23, 1992.
[32] R. Davis, “Interactive Transfer of Expertise,” Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project, The Addison-Wesley series in artificial intelligence, B.G. Buchanan and E.H. Shortliffe, eds., pp. 171-205, Reading, Mass.: Addison-Wesley, 1984.
[33] C.E. Brodley, “Recursive Automatic Bias Selection for Classifier Construction,” Machine Learning, special issue on bias evaluation and selection, vol. 20, pp. 63-94, 1995.
[34] D. Michie, D. Spiegelhalter, and C. Taylor, Machine Learning, Neural and Statistical Classification. Ellis Horwood, 1994.
[35] P. Brazdil, J. Gama, and B. Henery, “Characterizing the Applicability of Classification Algorithms Using Meta-Level Learning,” Proc. European Conf. Machine Learning (ECML '94), pp. 83-102, 1994.
[36] J. Gama and P. Brazdil, “Characterization of Classification Algorithms,” Proc. Seventh Portuguese Conf. Artificial Intelligence, pp. 189-200, 1995.
[37] M. Hilario and A. Kalousis, “Fusion of Meta-Knowledge and Meta-Data for Case-Based Model Selection,” Technical Report, Univ. of Geneva, Geneva UNIGE-AI-01-01, 2001.
[38] B. Buchanan, C. Johnson, T. Mitchell, and R. Smith, “Models of Learning Systems,” Encyclopedia of Computer Science and Technology, J. Belzer, A.G. Holzman, and A. Kent, eds., pp. 24-51, New York: M. Dekker, 1975.
[39] D. Gordon and M. desJardins, “Evaluation and Selection of Biases in Machine Learning,” Machine Learning, vol. 20, pp. 5-22, 1995.
[40] D. Tcheng, B. Lambert, S. Lu, and L. Rendell, “Building Robust Learning Systems by Combining Induction and Optimization,” Proc. Int'l Joint Conf. Artificial Intelligence (IJCAI '89), pp. 806-812, 1989.
[41] F.J. Provost and B.G. Buchanan, “Inductive Policy: The Pragmatics of Bias Selection,” Machine Learning, vol. 20, pp. 35-61, 1995.
[42] P. Brazdil and C. Soares, “A Comparison of Ranking Methods for Classification Algorithm Selection,” Proc. 11th European Conf. Machine Learning and Data Mining (ECML 2000), pp. 63-74, 2000.
[43] C. Soares, P. Brazdil, and J. Costa, “Improved Statistical Support for Matchmaking: Rank Correlation Taking Rank Importance into Account,” VII Jornadas de Classificação e Análise de Dados, pp. 72-75, 2001.
[44] B. Pfahringer, H. Bensusan, and C. Giraud-Carrier, “Meta-Learning by Landmarking Various Learning Algorithms,” Proc. 17th Int'l Conf. Machine Learning (ICML 2000), pp. 743-750, 2000.
[45] J. Fürnkranz and J. Petrak, “An Evaluation of Landmarking Variants,” Proc. 12th European Conf. Machine Learning and Data Mining/Fifth European Conf. Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD 2001) Workshop on Integrating Aspects of Data Mining, Decision Support, and Meta-Learning, pp. 57-68, 2001.
[46] J. Petrak, “Fast Subsampling Performance Estimates for Classification Algorithm Selection,” Austrian Research Institute for Artificial Intelligence TR-2000-07, 2000.
[47] W.A. Gale, Artificial Intelligence and Statistics. Reading, Mass.: Addison-Wesley Pub. Co., 1986.
[48] D. Hand, “Statistical Expert Systems,” Chance, vol. 7, pp. 28-34, 1994.
[49] R. Oldford and S. Peters, “Implementation and Study of Statistical Strategy,” Artificial Intelligence and Statistics, W.A. Gale, ed., pp. 335-353, Reading, Mass.: Addison-Wesley, 1985.
[50] R. Oldford, “Computational Thinking for Statisticians: Training by Implementing Statistical Strategy,” Proc. 29th Symp. Interface, 1997.
[51] D. Lubinsky and D. Pregibon, “Data Analysis as Search,” J. Econometrics, vol. 38, pp. 247-268, 1988.
[52] M. Scholz and T. Euler, “Documentation of the MiningMart Meta Model (M4),” TR12-05, IST Project MiningMart, IST-11993, 2002.
[53] R. Kerber, H. Beck, T. Anand, and B. Smart, “Active Templates: Comprehensive Support for the Knowledge Discovery Process,” Proc. Fourth Int'l Conf. Knowledge Discovery and Data Mining (KDD '98), pp. 244-248, 1998.
[54] A. Suyama and T. Yamaguchi, “Specifying and Learning Inductive Learning Systems Using Ontologies,” Working Notes from the Proc. 1998 AAAI Workshop Methodology of Applying Machine Learning: Problem Definition, Task Decomposition and Technique Selection, pp. 29-36, 1998.
[55] S. Nishisato, Elements of Dual Scaling: An Introduction to Practical Data Analysis. Hillsdale, N.J.: L. Erlbaum Associates, 1994.
[56] B.T. Pentland, “Organizing Moves in Software Support Hot Lines,” Administrative Science Quarterly, vol. 37, pp. 527-548, 1992.
[57] M.S. Ackerman and E. Mandel, “Memory in the Small: Combining Collective Memory and Task Support for a Scientific Community,” J. Organizational Computing and Electronic Commerce, vol. 9, pp. 105-127, 1999.
[58] R. Kohavi and G.H. John, “Wrappers for Feature Subset Selection,” Artificial Intelligence, vol. 97, pp. 273-324, 1997.
[59] J.A. Hoeting, D. Madigan, A.E. Raftery, and C.T. Volinsky, “Bayesian Model Averaging,” Statistical Science, vol. 14, pp. 382-401, 1999.
[60] D. Lenat, “AM: Discovery in Mathematics as Heuristic Search,” Knowledge-Based Systems in Artificial Intelligence, McGraw-Hill Advanced Computer Science Series, D. Lenat and R. Davis, eds., pp. 3-225, McGraw-Hill, 1982.
[61] G. Livingston, J.M. Rosenberg, and B.G. Buchanan, “An Agenda- and Justification-Based Framework for Discovery Systems,” J. Knowledge and Information Systems, vol. 5, pp. 133-161, 2003.

Index Terms:
Cost-sensitive learning, data mining, data mining process, intelligent assistants, knowledge discovery, knowledge discovery process, machine learning, metalearning.
Citation:
Abraham Bernstein, Foster Provost, Shawndra Hill, "Toward Intelligent Assistance for a Data Mining Process: An Ontology-Based Approach for Cost-Sensitive Classification," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 4, pp. 503-518, April 2005, doi:10.1109/TKDE.2005.67