Mining Low-Support Discriminative Patterns from Dense and High-Dimensional Data
February 2012 (vol. 24 no. 2)
pp. 279-294
Gang Fang, University of Minnesota, Minneapolis
Gaurav Pandey, University of California Berkeley, Berkeley
Wen Wang, University of Minnesota, Minneapolis
Manish Gupta, Oracle India Private Ltd.
Michael Steinbach, University of Minnesota, Minneapolis
Vipin Kumar, University of Minnesota, Minneapolis

[1] Mental Health Services Administration, "The Role of Biomarkers in the Treatment of Alcohol Use Disorders," Substance Abuse Treatment Advisory, vol. 5, no. 4, pp. 4206-4223, 2006.
[2] R. Agrawal and R. Srikant, "Fast Algorithms for Mining Association Rules," Proc. Very Large Data Bases (VLDB), pp. 487-499, 1994.
[3] M. Ashburner et al., "Gene Ontology: Tool for the Unification of Biology," Nature Genetics, vol. 25, no. 1, pp. 25-29, 2000.
[4] A. Asuncion and D. Newman, UCI Machine Learning Repository, http://mlearn.ics.uci.eduMLRepository.html, 2007.
[5] S. Bay and M. Pazzani, "Detecting Group Differences: Mining Contrast Sets," Data Mining and Knowledge Discovery, vol. 5, no. 3, pp. 213-246, 2001.
[6] R.J. Bayardo, "Efficiently Mining Long Patterns from Databases," Proc. ACM SIGMOD Int'l Conf. Management of Data, pp. 85-93, 1998.
[7] S. Brin, R. Motwani, and C. Silverstein, "Beyond Market Baskets: Generalizing Association Rules to Correlations," Proc. ACM SIGMOD Int'l Conf. Management of Data, pp. 265-276, 1997.
[8] C. Carlson et al., "Mapping Complex Disease Loci in Whole-genome Association Studies," Nature, vol. 429, no. 6990, pp. 446-452, 2004.
[9] H. Cheng, X. Yan, J. Han, and C.-W. Hsu, "Discriminative Frequent Pattern Analysis for Effective Classification," Proc. Int'l Conf. Data Eng. (ICDE), pp. 716-725, 2007.
[10] H. Cheng, X. Yan, J. Han, and P. Yu, "Direct Discriminative Pattern Mining for Effective Classification," Proc. Int'l Conf. Data Eng. (ICDE), pp. 169-178, 2008.
[11] G. Cong, K. Tan, A. Tung, and X. Xu, "Mining Top-K Covering Rule Groups for Gene Expression Data," Proc. ACM SIGMOD Int'l Conf. Management of Data, pp. 670-681, 2005.
[12] C. Creighton and S. Hanash, "Mining Gene Expression Databases for Association Rules," Bioinformatics, vol. 19, no. 1, pp. 79-86, 2003.
[13] M. Deshpande, M. Kuramochi, N. Wale, and G. Karypis, "Frequent Sub-Structure Based Approaches for Classifying Chemical Compounds," IEEE Trans. Knowledge and Data Eng., vol. 17, no. 8, pp. 1036-1050, Aug. 2005.
[14] G. Dong and J. Li, "Efficient Mining of Emerging Patterns: Discovering Trends and Differences," Proc. ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, pp. 43-52, 1999.
[15] W. Fan, K. Zhang, H. Cheng, J. Gao, X. Yan, J. Han, P.S. Yu, and O. Verscheure, "Direct Mining of Discriminative and Essential Graphical and Itemset Features via Model-Based Search Tree," Proc. ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, pp. 230-238, 2008.
[16] R. Fisher, "On the Interpretation of $\chi^2$ from Contingency Tables, and the Calculation of P," J. Royal Statistical Soc., vol. 85, pp. 87-94, 1922.
[17] G. Garriga, P. Kralj, and N. Lavrač, "Closed Sets for Labeled Data," J. Machine Learning Research, vol. 9, pp. 559-580, 2008.
[18] A. Gionis et al., "Assessing Data Mining Results via Swap Randomization," ACM Trans. Knowledge Discovery from Data, vol. 1, no. 3, p. 14, 2007.
[19] G. Grahne and J. Zhu, "Efficiently Using Prefix-Trees in Mining Frequent Itemsets," Proc. Workshop Frequent Itemset Mining Implementations, 2003.
[20] J. Han et al., "Frequent Pattern Mining: Current Status and Future Directions," Data Mining and Knowledge Discovery, vol. 15, pp. 55-86, 2007.
[21] M.E. Higgins, M. Claremont, J.E. Major, C. Sander, and A.E. Lash, "CancerGenes: A Gene Selection Resource for Cancer Genome Projects," Nucleic Acids Research, vol. 35, no. supplement 1, pp. D721-D726, 2007.
[22] T. Hwang, H. Sicotte, Z. Tian, B. Wu, J. Kocher, D. Wigle, V. Kumar, and R. Kuang, "Robust and Efficient Identification of Biomarkers by Classifying Features on Graphs," Bioinformatics, vol. 24, no. 18, pp. 2023-2029, 2008.
[23] S. Jaroszewicz and D.A. Simovici, "Pruning Redundant Association Rules Using Maximum Entropy Principle," Proc. Pacific-Asia Conf. Advances in Knowledge Discovery and Data Mining (PAKDD), pp. 135-147, May 2002.
[24] P. Kralj, N. Lavrač, D. Gamberger, and A. Krstačić, "Contrast Set Mining for Distinguishing between Similar Diseases," Proc. Conf. Artificial Intelligence in Medicine, pp. 109-118, 2007.
[25] P. Kralj Novak, N. Lavrač, D. Gamberger, and A. Krstačić, "CSM-SD: Methodology for Contrast Set Mining through Subgroup Discovery," J. Biomedical Informatics, vol. 42, no. 1, pp. 113-122, 2009.
[26] J. Li, G. Dong, and K. Ramamohanarao, "Making Use of the Most Expressive Jumping Emerging Patterns for Classification," Knowledge and Information Systems, vol. 3, no. 2, pp. 131-145, 2001.
[27] J. Li, G. Liu, and L. Wong, "Mining Statistically Important Equivalence Classes and Delta-Discriminative Emerging Patterns," Proc. ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, pp. 430-439, 2007.
[28] W. Li, J. Han, and J. Pei, "CMAR: Accurate and Efficient Classification Based on Multiple Class-Association Rules," Proc. IEEE Int'l Conf. Data Mining (ICDM), pp. 369-376, 2001.
[29] B. Liu, W. Hsu, and Y. Ma, "Integrating Classification and Association Rule Mining," Proc. ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, pp. 80-86, 1998.
[30] D. Lo, H. Cheng, J. Han, S. Khoo, and C. Sun, "Classification of Software Behaviors for Failure Detection: A Discriminative Pattern Mining Approach," Proc. ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, pp. 557-566, 2009.
[31] E. Loekito and J. Bailey, "Fast Mining of High Dimensional Expressive Contrast Patterns Using Zero-Suppressed Binary Decision Diagrams," Proc. ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, pp. 307-316, 2006.
[32] T. McIntosh and S. Chawla, "High Confidence Rule Mining for Microarray Analysis," IEEE/ACM Trans. Computational Biology and Bioinformatics, vol. 4, no. 4, pp. 611-623, Oct.-Dec. 2007.
[33] R. Miller, Simultaneous Statistical Inference. Springer-Verlag Inc., 1981.
[34] S. Morishita and J. Sese, "Traversing Itemset Lattices with Statistical Metric Pruning," Proc. ACM SIGMOD-SIGACT-SIGART Symp. Principles of Database Systems (PODS), pp. 226-236, 2000.
[35] P. Novak, N. Lavrač, and G. Webb, "Supervised Descriptive Rule Discovery: A Unifying Survey of Contrast Set, Emerging Pattern and Subgroup Mining," J. Machine Learning Research, vol. 10, pp. 377-403, 2009.
[36] G. Pandey, G. Atluri, M. Steinbach, C.L. Myers, and V. Kumar, "An Association Analysis Approach to Biclustering," Proc. ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, pp. 677-686, 2009.
[37] N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal, "Discovering Frequent Closed Itemsets for Association Rules," Proc. Int'l Conf. Database Theory (ICDT), pp. 398-416, 1999.
[38] E. Segal, N. Friedman, N. Kaminski, A. Regev, and D. Koller, "From Signatures to Models: Understanding Cancer Using Microarrays," Nature Genetics, vol. 37, pp. S38-S45, 2005.
[39] D. Segre et al., "Modular Epistasis in Yeast Metabolism," Nature Genetics, vol. 37, pp. 77-83, 2004.
[40] J. Shaffer, "Multiple Hypothesis Testing," Ann. Rev. of Psychology, vol. 46, no. 1, pp. 561-584, 1995.
[41] A. Soulet et al., "Condensed Representation of Emerging Patterns," Proc. Pacific-Asia Conf. Knowledge Discovery and Data Mining (PAKDD), pp. 127-132, 2004.
[42] A. Subramanian et al., "Gene Set Enrichment Analysis: A Knowledge-Based Approach for Interpreting Genome-Wide Expression Profiles," Proc. Nat'l Academy of Sciences USA, vol. 102, no. 43, pp. 15545-15550, 2005.
[43] P.-N. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining. Addison-Wesley, 2005.
[44] N. Tatti, "Maximum Entropy Based Significance of Itemsets," Knowledge and Information Systems, vol. 17, no. 1, pp. 57-77, Oct. 2008.
[45] M.J. van de Vijver et al., "A Gene-expression Signature as a Predictor of Survival in Breast Cancer," New England J. Medicine, vol. 347, pp. 1999-2009, 2002.
[46] L.J. van 't Veer et al., "Gene Expression Profiling Predicts Clinical Outcome of Breast Cancer," Nature, vol. 415, pp. 530-536, 2002.
[47] M. van Vliet, C. Klijn, L. Wessels, and M. Reinders, "Module-Based Outcome Prediction Using Breast Cancer Compendia," PLoS ONE, vol. 2, no. 10, p. e1047, 2007.
[48] K. Verhoeven et al., "Implementing False Discovery Rate Control: Increasing Your Power," Oikos, vol. 108, no. 3, pp. 643-647, 2005.
[49] J. Wang and G. Karypis, "HARMONY: Efficiently Mining the Best Rules for Classification," Proc. SIAM Int'l Data Mining Conf. (SDM), p. 205, 2005.
[50] K. Wang et al., "Pathway-Based Approaches for Analysis of Genomewide Association Studies," Am. J. Human Genetics, vol. 81, no. 6, pp. 1278-1283, 2007.
[51] G.I. Webb et al., "On Detecting Differences between Groups," Proc. ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining, pp. 256-265, 2003.
[52] P. Westfall and S. Young, "P Value Adjustments for Multiple Tests in Multivariate Binomial Models," J. Am. Statistical Assoc., vol. 84, pp. 780-786, 1989.
[53] H. Xiong, P. Tan, and V. Kumar, "Hyperclique Pattern Discovery," Data Mining and Knowledge Discovery, vol. 13, no. 2, pp. 219-242, 2006.
[54] X. Yan et al., "Mining Significant Graph Patterns by Leap Search," Proc. ACM SIGMOD Int'l Conf. Management of Data, pp. 433-444, 2008.
[55] X. Yin and J. Han, "CPAR: Classification Based on Predictive Association Rules," Proc. SIAM Int'l Data Mining Conf. (SDM), pp. 331-335, 2003.
[56] N. Yosef, Z. Yakhini, A. Tsalenko, V. Kristensen, A. Borresen-Dale, E. Ruppin, and R. Sharan, "A Supervised Approach for Identifying Discriminating Genotype Patterns and Its Application to Breast Cancer Data," Bioinformatics, vol. 23, no. 2, pp. 91-98, 2007.

Index Terms:
Association analysis, discriminative pattern mining, biomarker discovery, permutation test.
Citation:
Gang Fang, Gaurav Pandey, Wen Wang, Manish Gupta, Michael Steinbach, Vipin Kumar, "Mining Low-Support Discriminative Patterns from Dense and High-Dimensional Data," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 2, pp. 279-294, Feb. 2012, doi:10.1109/TKDE.2010.241