Abstract-Driven Pattern Discovery in Databases
December 1993 (vol. 5 no. 6)
pp. 926-938

The problem of discovering interesting patterns in large volumes of data is studied. Patterns can be expressed not only in terms of the database schema but also in user-defined terms, such as relational views and classification hierarchies. The user-defined terminology is stored in a data dictionary that maps it into the language of the database schema. A pattern is defined as a deductive rule that is expressed in user-defined terms and has a degree of uncertainty associated with it. Methods are presented for discovering interesting patterns based on abstracts, which are summaries of the data expressed in the language of the user.

Index Terms:
abstract-driven pattern discovery; databases; database schema; user-defined terms; relational views; classification hierarchies; data dictionary; deductive rule; data abstraction; generalization; abstract data types; classification; deductive databases; knowledge based systems; user interfaces
V. Dhar, A. Tuzhilin, "Abstract-Driven Pattern Discovery in Databases," IEEE Transactions on Knowledge and Data Engineering, vol. 5, no. 6, pp. 926-938, Dec. 1993, doi:10.1109/69.250075