The Minimum Consistent Subset Cover Problem: A Minimization View of Data Mining
March 2013 (vol. 25 no. 3)
pp. 690-703
Byron J. Gao, Texas State University - San Marcos, San Marcos
Martin Ester, Simon Fraser University, Burnaby
Hui Xiong, Rutgers, the State University of New Jersey, Newark
Jin-Yi Cai, University of Wisconsin - Madison, Madison
Oliver Schulte, Simon Fraser University, Burnaby
In this paper, we introduce and study the minimum consistent subset cover (MCSC) problem: given a finite ground set X and a constraint t, find the minimum number of consistent subsets that cover X, where a subset of X is consistent if it satisfies t. The MCSC problem generalizes the traditional set covering problem and has minimum clique partition (MCP), a dual problem of graph coloring, as an instance. Many common data mining tasks in rule learning, clustering, and pattern mining can be formulated as MCSC instances. In particular, we discuss the minimum rule set (MRS) problem, which minimizes the model complexity of decision rules; the converse k-clustering problem, which minimizes the number of clusters; and the pattern summarization problem, which minimizes the number of patterns. Our proposed generic algorithm, CAG, is directly applicable to any of these MCSC instances. CAG starts by constructing a maximal optimal partial solution, then performs an example-driven specific-to-general search on a dynamically maintained bipartite assignment graph to simultaneously learn a small set of consistent subsets that covers the ground set.
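To make the problem statement concrete, the following is a minimal sketch of an MCSC instance, using minimum clique partition as the example: the ground set is the vertex set of a graph, and a subset is consistent if it forms a clique. The heuristic shown is a plain first-fit greedy pass, not the paper's CAG algorithm; the names `greedy_consistent_cover` and `is_clique` and the sample graph are assumptions made for illustration. It yields a valid consistent cover but makes no claim of minimality.

```python
from itertools import combinations
from typing import Callable, Hashable, Iterable, List, Set


def greedy_consistent_cover(
    ground_set: Iterable[Hashable],
    is_consistent: Callable[[Set[Hashable]], bool],
) -> List[Set[Hashable]]:
    """First-fit heuristic (NOT the paper's CAG): place each element into the
    first existing subset that stays consistent, else open a new subset."""
    cover: List[Set[Hashable]] = []
    for x in ground_set:
        for subset in cover:
            if is_consistent(subset | {x}):
                subset.add(x)
                break
        else:
            # No existing subset accepts x; it must be consistent on its own.
            if not is_consistent({x}):
                raise ValueError(f"element {x!r} cannot be covered")
            cover.append({x})
    return cover


# Hypothetical example graph: a triangle 1-2-3 with a path 3-4-5 attached.
edges = {frozenset(e) for e in [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]}


def is_clique(vertices: Set[Hashable]) -> bool:
    """Constraint t for minimum clique partition: every pair is adjacent."""
    return all(frozenset(pair) in edges for pair in combinations(vertices, 2))


cover = greedy_consistent_cover([1, 2, 3, 4, 5], is_clique)
print(cover)
```

Swapping `is_clique` for a different predicate (for example, a cluster-diameter bound or a rule-purity check) gives other MCSC instances with the same driver function. The result of the first-fit pass depends on the order in which elements are visited.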
Index Terms:
Data mining, Pattern recognition, Complexity theory, Minimization, Decision trees, Graph coloring, Clustering algorithms, pattern summarization, Minimum consistent subset cover, set covering, graph coloring, minimum clique partition, minimum star partition, minimum rule set, converse k-clustering
Citation:
Byron J. Gao, Martin Ester, Hui Xiong, Jin-Yi Cai, Oliver Schulte, "The Minimum Consistent Subset Cover Problem: A Minimization View of Data Mining," IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 3, pp. 690-703, March 2013, doi:10.1109/TKDE.2011.260