Induction. The most common perspective, induction—proceeding from the specific to the general—has its roots in AI and machine learning. It answers questions like "given 10 specific examples of good travel destinations, what are the characteristics of a favorable tourist attraction?"
Thus, induction is typically implemented as a search through the space of possible hypotheses. Such searches usually employ some bias or heuristic to arrive at a good generalization, such as "tropical islands are favorable." Systems such as Progol (not to be confused with Prolog), FOIL (First-Order Inductive Learner), and Golem treat induction as the inverse of the deduction process in first-order logic inference.
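To make the specific-to-general search concrete, here is a minimal sketch in the style of the classic Find-S procedure; the travel-destination data and feature names are invented for illustration, and real inductive learners search far richer hypothesis spaces.

```python
# A minimal sketch of induction as hypothesis-space search: start from the
# most specific hypothesis and generalize it just enough to cover every
# positive example. (Destinations and features are hypothetical.)

def generalize(examples):
    """Return the most specific conjunction of attribute=value pairs
    that covers every positive example."""
    hypothesis = dict(examples[0])          # start with the first example
    for example in examples[1:]:
        for attr in list(hypothesis):
            # Drop any condition the new example contradicts.
            if hypothesis[attr] != example.get(attr):
                del hypothesis[attr]
    return hypothesis

destinations = [
    {"climate": "tropical", "terrain": "island", "beaches": True},
    {"climate": "tropical", "terrain": "island", "beaches": True},
    {"climate": "tropical", "terrain": "coast",  "beaches": True},
]

# -> {'climate': 'tropical', 'beaches': True}: "tropical places with beaches"
print(generalize(destinations))
```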
Compression. Of course, several general concepts can apply to a single set of data, so mining techniques typically look for the most succinct or most easily described pattern. This principle, known as Occam's Razor, effectively equates mining with compression: the learned patterns are, in some sense, "smaller to describe" than an exhaustive enumeration of the original data.
The emergence of computational learning theory in the 1980s and the practicality of frameworks such as the minimum description length (MDL) principle provided a solid theoretical foundation for this perspective. Several commercial data-mining systems employ this view of data mining as compression to determine the effectiveness of mined patterns: if a pattern mined from 10 data points is itself 16 "features" long, then mining might provide no tangible benefit.
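As a rough illustration of the MDL arithmetic (the unit costs below are invented for the sketch), a pattern pays for itself only when its description, plus any exceptions it fails to cover, is shorter than listing the raw data outright:

```python
# Toy MDL comparison: accept a pattern only if pattern + exceptions is
# cheaper to describe than the data itself. All costs are in made-up units.

def description_length(pattern_size, exceptions, point_cost=1):
    # cost of the pattern itself + cost of enumerating each exception
    return pattern_size + exceptions * point_cost

n_points = 10
raw_cost = n_points * 1                  # enumerate all 10 points directly

useful = description_length(pattern_size=3, exceptions=2)    # 5 < 10: keep
useless = description_length(pattern_size=16, exceptions=0)  # 16 > 10: reject

print(useful < raw_cost, useless < raw_cost)   # True False
```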
Querying. This perspective comes from the database systems community. Because most business data resides in industrial databases and warehouses, commercial companies view mining as a sophisticated form of database querying. Research based on this perspective seeks to enhance the expressiveness of query languages like SQL to allow queries like "Find all the customers with deviant transactions."
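To illustrate, here is a plain-Python sketch of such a deviance query; the schema, the sample data, and the two-standard-deviation rule for "deviant" are all assumptions made for this example.

```python
# Sketch of a "find deviant transactions" query: flag any amount that falls
# far outside the customer's own spending profile. (Hypothetical data; a
# real system would push this computation into the query engine.)
from collections import defaultdict
from statistics import mean, pstdev

transactions = [
    ("alice", 40.0), ("alice", 35.0), ("alice", 42.0), ("alice", 38.0),
    ("alice", 41.0), ("alice", 37.0), ("alice", 900.0),
    ("bob", 15.0), ("bob", 18.0), ("bob", 16.0), ("bob", 17.0),
]

amounts_by_customer = defaultdict(list)
for customer, amount in transactions:
    amounts_by_customer[customer].append(amount)

for customer, amounts in amounts_by_customer.items():
    mu, sigma = mean(amounts), pstdev(amounts)
    for amount in amounts:
        if sigma > 0 and abs(amount - mu) > 2 * sigma:
            print(f"deviant: {customer} spent {amount}")
```

The expressiveness gap this research aims to close is precisely the one the sketch exposes: today the analyst must pull the data out and compute such statistics in application code rather than state the query declaratively.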
Other database perspectives concentrate on enhancing the underlying data model. (The relational model is good for abstracting and querying data; is it also a good model for mining?) Others offer metaquery languages ("Find me a pattern that connects something about writers' backgrounds and the characters in their novels"). Still others develop interactive techniques for exploring databases.
Approximation. This view of mining starts with an accurate (exact) model of the data and deliberately introduces approximations in the hope of revealing hidden structure.
Such approximations might involve dropping higher-order terms in a harmonic expansion or collapsing two or more nearby entities into one—viewing three connected nodes as one in a graph, for instance.
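A sketch of the first kind of approximation follows: zeroing all but the lowest-frequency Fourier coefficients of a noisy series, so that the underlying periodic structure stands out. The signal, the noise level, and the cutoff are all synthetic choices made for illustration.

```python
# Approximation by dropping higher-order harmonic terms: keep only the
# first few Fourier components of a noisy series and reconstruct.
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 256, endpoint=False)
signal = np.sin(2 * np.pi * 3 * t) + 0.5 * rng.standard_normal(256)

coeffs = np.fft.rfft(signal)
coeffs[8:] = 0                       # drop all but the lowest harmonics
smoothed = np.fft.irfft(coeffs, n=256)

# The reconstruction tracks the hidden 3-cycle sine wave, not the noise.
err = np.sqrt(np.mean((smoothed - np.sin(2 * np.pi * 3 * t)) ** 2))
print(f"RMS error vs. hidden structure: {err:.3f}")
```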
One technique that has found extensive use in document retrieval is called Latent Semantic Indexing. This technique, patented by Bellcore, uses linear algebraic matrix transformations and approximations to identify hidden structures in word usage, thus enabling searches that go beyond simple keyword matching. Related techniques underlie Karhunen-Loève expansions in signal processing and principal-component analysis in statistics.
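A toy sketch of the idea behind LSI, using a truncated singular value decomposition, appears below; the vocabulary and term-document matrix are invented for illustration, and real systems work at vastly larger scale with weighted counts.

```python
# LSI-style retrieval: approximate the term-document matrix with a rank-k
# SVD, then match queries and documents in the latent "concept" space.
import numpy as np

terms = ["car", "automobile", "engine", "flower", "petal"]
# Columns are documents: d0 = "car engine", d1 = "automobile engine",
# d2 = "flower petal".
A = np.array([
    [1, 0, 0],   # car
    [0, 1, 0],   # automobile
    [1, 1, 0],   # engine
    [0, 0, 1],   # flower
    [0, 0, 1],   # petal
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2                                     # keep the two strongest concepts
docs_k = np.diag(s[:k]) @ Vt[:k]          # documents in latent space

query = np.array([1, 0, 0, 0, 0], dtype=float)       # the word "car"
query_k = np.diag(1 / s[:k]) @ U[:, :k].T @ query    # fold the query in

sims = docs_k.T @ query_k
sims /= np.linalg.norm(docs_k, axis=0) * np.linalg.norm(query_k)
# d1 never contains "car", yet scores high: "car" and "automobile" share
# a latent concept through their co-occurrence with "engine"; d2 scores 0.
print(np.round(sims, 2))
```

This is the sense in which such searches go beyond keyword matching: the rank-k approximation merges terms that occur in similar contexts, so a query can retrieve documents that share no words with it.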
Search. This perspective relates to induction but focuses on efficiency. Our favorite example is the widely popular work on association rules at IBM Almaden, which uses the forward-pruning nature of patterns (frequent itemsets) to restrict the space of possible patterns: every subset of a frequent itemset must itself be frequent, so any candidate with an infrequent subset can be discarded without ever consulting the data.
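A compact sketch of that pruning idea, in the style of the Apriori algorithm, follows; the basket data and the minimum-support threshold are assumptions made for illustration.

```python
# Apriori-style frequent-itemset search: a candidate is counted only if
# every one of its subsets has already proved frequent, which prunes the
# exponential pattern space. (Toy baskets; hypothetical threshold.)
from itertools import combinations

baskets = [
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"bread", "eggs"},
    {"milk", "eggs"},
]
min_support = 2

def support(itemset):
    return sum(itemset <= basket for basket in baskets)

items = sorted(set().union(*baskets))
frequent = {frozenset([i]) for i in items
            if support(frozenset([i])) >= min_support}
level, k = frequent, 1
while level:
    k += 1
    candidates = {a | b for a in level for b in level if len(a | b) == k}
    # Forward pruning: skip any candidate with an infrequent subset.
    candidates = {c for c in candidates
                  if all(frozenset(s) in frequent
                         for s in combinations(c, k - 1))}
    level = {c for c in candidates if support(c) >= min_support}
    frequent |= level

print(sorted(sorted(s) for s in frequent))
```

The subset check is what makes the search efficient: rather than testing all possible itemsets against the data, the algorithm only ever counts candidates whose every subset has already survived.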
Mining techniques can also be classified by:
• their induced representations (decision trees, rules, correlations, deviations, trends, or associations);
• the data they operate on (continuous, time series, discrete, labeled, or nominal); or
• application domains (finance, economic models, biology, Web log mining, or semistructured models for abstracting from Web pages).