Database Mining: A Performance Perspective
December 1993 (vol. 5 no. 6)
pp. 914-925

The authors' perspective on database mining as the confluence of machine learning techniques and the performance emphasis of database technology is presented. Three classes of database mining problems, involving classification, associations, and sequences, are described. It is argued that these problems can be uniformly viewed as requiring the discovery of rules embedded in massive amounts of data. A model and some basic operations for the process of rule discovery are described. It is shown how the database mining problems considered map to this model, and how they can be solved by using the basic operations proposed. An example is given of an algorithm for classification obtained by combining the basic rule discovery operations. This algorithm is efficient in discovering classification rules and has accuracy comparable to ID3, one of the best current classifiers.
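
The abstract uses ID3 as its accuracy baseline without describing it. As background, here is a minimal sketch of the split criterion ID3 is generally known for, information gain over categorical attributes. It is not the classification algorithm proposed in the article. The function names and the dictionary-per-row data layout are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(rows, labels, attribute):
    """Entropy reduction from partitioning the rows on one categorical attribute.

    rows is a list of dicts (hypothetical layout), labels is the parallel
    list of class labels.
    """
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the partitions after the split.
    remainder = sum(len(part) / total * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Pick the attribute with the highest information gain (ID3's greedy step)."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))
```

A full ID3 tree builder would call `best_attribute` recursively on each partition until the labels in a partition are pure or no attributes remain.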


Index Terms:
database mining; performance perspective; machine learning techniques; classification; associations; sequences; rule discovery; ID3; decision trees; knowledge discovery; DBMS mining; database management systems; decision theory; knowledge based systems; learning (artificial intelligence); performance evaluation
R. Agrawal, T. Imielinski, A. Swami, "Database Mining: A Performance Perspective," IEEE Transactions on Knowledge and Data Engineering, vol. 5, no. 6, pp. 914-925, Dec. 1993, doi:10.1109/69.250074