
ASCII Text
Qiang Cheng, Hongbo Zhou, Jie Cheng, "The Fisher-Markov Selector: Fast Selecting Maximally Separable Feature Subset for Multiclass Classification with Applications to High-Dimensional Data," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 6, pp. 1217-1233, June 2011.
BibTeX
@article{ 10.1109/TPAMI.2010.195,
  author = {Qiang Cheng and Hongbo Zhou and Jie Cheng},
  title = {The Fisher-Markov Selector: Fast Selecting Maximally Separable Feature Subset for Multiclass Classification with Applications to High-Dimensional Data},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume = {33},
  number = {6},
  issn = {0162-8828},
  year = {2011},
  pages = {1217-1233},
  doi = {http://doi.ieeecomputersociety.org/10.1109/TPAMI.2010.195},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
RefWorks Procite/RefMan/Endnote
TY  - JOUR
JO  - IEEE Transactions on Pattern Analysis and Machine Intelligence
TI  - The Fisher-Markov Selector: Fast Selecting Maximally Separable Feature Subset for Multiclass Classification with Applications to High-Dimensional Data
IS  - 6
SN  - 0162-8828
SP  - 1217
EP  - 1233
EPD  - 1217-1233
A1  - Qiang Cheng
A1  - Hongbo Zhou
A1  - Jie Cheng
PY  - 2011
KW  - Classification
KW  - feature subset selection
KW  - Fisher's linear discriminant analysis
KW  - high-dimensional data
KW  - kernel
KW  - Markov random field
VL  - 33
JA  - IEEE Transactions on Pattern Analysis and Machine Intelligence
ER  -
[1] P. Domingos and M. Pazzani, “On the Optimality of the Simple Bayesian Classifier under Zero-One Loss,” Machine Learning, vol. 29, pp. 103-130, 1997.
[2] S. Dudoit, J. Fridlyand, and T. Speed, “Comparison of Discrimination Methods for the Classification of Tumors Using Gene Expression Data,” J. Am. Statistical Assoc., vol. 97, pp. 77-87, 2002.
[3] J. Fan and Y. Fan, “High Dimensional Classification Using Features Annealed Independence Rules,” Annals of Statistics, vol. 36, pp. 2232-2260, 2008.
[4] T.M. Cover, “Geometrical and Statistical Properties of Systems of Linear Inequalities with Applications in Pattern Recognition,” IEEE Trans. Electronic Computers, vol. 14, no. 3, pp. 326-334, June 1965.
[5] R.A. Fisher, “The Use of Multiple Measurements in Taxonomic Problems,” Annals of Eugenics, vol. 7, pp. 179-188, 1936.
[6] W.J. Dixon and F.J. Massey, Introduction to Statistical Analysis, second ed. McGraw-Hill, 1957.
[7] M.G. Kendall, A Course in Multivariate Analysis. Griffin, 1957.
[8] P.A. Devijver and J. Kittler, Pattern Recognition: A Statistical Approach. Prentice Hall, 1982.
[9] K. Fukunaga, Introduction to Statistical Pattern Recognition, second ed. Academic Press, 1990.
[10] G.J. McLachlan, Discriminant Analysis and Statistical Pattern Recognition. Wiley, 2004.
[11] K.S. Fu, Sequential Methods in Pattern Recognition and Machine Learning. Academic Press, 1968.
[12] K.S. Fu, P.J. Min, and T.J. Li, “Feature Selection in Pattern Recognition,” IEEE Trans. Systems Science and Cybernetics, vol. 6, no. 1, pp. 33-39, Jan. 1970.
[13] C.H. Chen, “On a Class of Computationally Efficient Feature Selection Criteria,” Pattern Recognition, vol. 7, pp. 87-94, 1975.
[14] P. Narendra and K. Fukunaga, “A Branch and Bound Algorithm for Feature Subset Selection,” IEEE Trans. Computers, vol. 26, no. 9, pp. 917-922, Sept. 1977.
[15] J. Rissanen, Stochastic Complexity in Statistical Inquiry. World Scientific Publishing Company, 1989.
[16] K. Kira and L.A. Rendell, “A Practical Approach to Feature Selection,” Proc. Int'l Conf. Machine Learning, pp. 249-256, 1992.
[17] R. Tibshirani, “Regression Shrinkage and Selection via the Lasso,” J. Royal Statistical Soc. Series B: Methodological, vol. 58, pp. 267-288, 1996.
[18] D.L. Donoho and M. Elad, “Optimally Sparse Representation in General (Nonorthogonal) Dictionaries via $l_1$ Minimization,” Proc. Nat'l Academy of Sciences USA, vol. 100, pp. 2197-2202, 2003.
[19] D.L. Donoho, “Compressed Sensing,” IEEE Trans. Information Theory, vol. 52, no. 4, pp. 1289-1306, Apr. 2006.
[20] E.J. Candes, J. Romberg, and T. Tao, “Stable Signal Recovery from Incomplete and Inaccurate Measurements,” Comm. Pure and Applied Math., vol. 59, pp. 1207-1223, 2006.
[21] E. Candes and T. Tao, “The Dantzig Selector: Statistical Estimation When p Is Much Larger Than n,” Annals of Statistics, vol. 35, no. 6, pp. 2313-2351, 2007.
[22] G.J. McLachlan, R.W. Bean, and D. Peel, “A Mixture Model-Based Approach to the Clustering of Microarray Expression Data,” Bioinformatics, vol. 18, pp. 413-422, 2002.
[23] H. Peng, F. Long, and C. Ding, “Feature Selection Based on Mutual Information: Criteria of Max-Dependency, Max-Relevance, and Min-Redundancy,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 27, no. 8, pp. 1226-1238, Aug. 2005.
[24] L. Wang, “Feature Selection with Kernel Class Separability,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 30, no. 9, pp. 1534-1546, Sept. 2008.
[25] J. Weston, S. Mukherjee, O. Chapelle, M. Pontil, T. Poggio, and V. Vapnik, “Feature Selection for SVMs,” Advances in Neural Information Processing Systems, T.K. Leen, T.G. Dietterich, and V. Tresp, eds., pp. 668-674, MIT Press, 2000.
[26] I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, “Gene Selection for Cancer Classification Using Support Vector Machines,” Machine Learning, vol. 46, nos. 1-3, pp. 389-422, 2002.
[27] A. Webb, Statistical Pattern Recognition, second ed. Wiley, 2002.
[28] H. Liu and H. Motoda, Feature Selection for Knowledge Discovery and Data Mining. Kluwer Academic Publishers, 1999.
[29] L. Breiman, “Random Forests,” Machine Learning, vol. 45, no. 1, pp. 5-32, 2001.
[30] D. Koller and M. Sahami, “Toward Optimal Feature and Subset Selection Problem,” Proc. Int'l Conf. Machine Learning, pp. 284-292, 1996.
[31] E.B. Fowlkes, R. Gnanadesikan, and J.R. Kettenring, “Variable Selection in Clustering and Other Contexts,” Design, Data, and Analysis, C.L. Mallows, ed., pp. 13-34, Wiley, 1987.
[32] R.O. Duda, P.E. Hart, and D.G. Stork, Pattern Classification, second ed. Wiley-Interscience, 2000.
[33] P. Bickel and E. Levina, “Some Theory of Fisher's Linear Discriminant Function, ‘Naive Bayes,’ and Some Alternatives Where There Are Many More Variables Than Observations,” Bernoulli, vol. 10, pp. 989-1010, 2004.
[34] S. Mika, G. Ratsch, and K.R. Muller, “A Mathematical Programming Approach to the Kernel Fisher Algorithm,” Advances in Neural Information Processing Systems, vol. 13, pp. 591-597, MIT Press, 2001.
[35] V.N. Vapnik, Statistical Learning Theory. Wiley, 1998.
[36] B. Scholkopf and A.J. Smola, Learning with Kernels. MIT Press, 2002.
[37] V.N. Vapnik, The Nature of Statistical Learning Theory. Springer, 1999.
[38] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge Univ. Press, 2004.
[39] S. Fidler, D. Skocaj, and A. Leonardis, “Combining Reconstructive and Discriminative Subspace Methods for Robust Classification and Regression by Subsampling,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 28, no. 3, pp. 337-350, Mar. 2006.
[40] H. Akaike, “A New Look at the Statistical Model Identification,” IEEE Trans. Automatic Control, vol. 19, no. 6, pp. 716-723, Dec. 1974.
[41] G. Schwarz, “Estimating the Dimension of a Model,” Annals of Statistics, vol. 6, pp. 361-379, 1978.
[42] D.P. Foster and E.I. George, “The Risk Inflation Criterion for Multiple Regression,” Annals of Statistics, vol. 22, pp. 1947-1975, 1994.
[43] J. Weston, A. Elisseeff, B. Scholkopf, and M.E. Tipping, “Use of the Zero-Norm with Linear Models and Kernel Methods,” J. Machine Learning Research, vol. 3, pp. 1439-1461, 2003.
[44] P.E. Greenwood and A.N. Shiryayev, Contiguity and the Statistical Invariance Principle. Gordon and Breach, 1985.
[45] M. Rosenblatt, Gaussian and Non-Gaussian Linear Time Series and Random Fields. Springer, 2000.
[46] D. Bosq, Nonparametric Statistics for Stochastic Processes. Springer, 1998.
[47] S. Geman and D. Geman, “Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 6, no. 6, pp. 721-741, Nov. 1984.
[48] G. Winkler, Image Analysis, Random Fields and Dynamic Monte Carlo Methods: A Mathematical Introduction, third ed. Springer-Verlag, 2006.
[49] S. Dai, S. Baker, and S.B. Kang, “An MRF-Based Deinterlacing Algorithm with Exemplar-Based Refinement,” IEEE Trans. Image Processing, vol. 18, no. 5, pp. 956-968, May 2009.
[50] D.S. Hochbaum, “An Efficient Algorithm for Image Segmentation, Markov Random Fields and Related Problems,” J. ACM, vol. 48, no. 2, pp. 686-701, 2001.
[51] J.P. Picard and H.D. Ratliff, “Minimum Cuts and Related Problem,” Networks, vol. 5, pp. 357-370, 1975.
[52] H. Ishikawa, “Exact Optimization for Markov Random Fields with Convex Priors,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 25, no. 10, pp. 1333-1336, Oct. 2003.
[53] V. Kolmogorov and R. Zabih, “What Energy Can Be Minimized via Graph Cuts?” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 26, no. 2, pp. 147-159, Feb. 2004.
[54] Y. Boykov, O. Veksler, and R. Zabih, “Fast Approximate Energy Minimization via Graph Cuts,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23, no. 11, pp. 1222-1239, Nov. 2001.
[55] M. Wainwright, T. Jaakkola, and A. Willsky, “MAP Estimation via Agreement on (Hyper)Trees: Message-Passing and Linear Programming,” IEEE Trans. Information Theory, vol. 51, no. 11, pp. 3697-3717, Nov. 2005.
[56] J. Yedidia, W. Freeman, and Y. Weiss, “Constructing Free Energy Approximations and Generalized Belief Propagation Algorithms,” IEEE Trans. Information Theory, vol. 51, no. 7, pp. 2282-2312, July 2004.
[57] J. Demsar, “Statistical Comparisons of Classifiers over Multiple Data Sets,” J. Machine Learning Research, vol. 7, pp. 1-30, 2006.
[58] C.L. Blake, D.J. Newman, S. Hettich, and C.J. Merz, UCI Repository of Machine Learning Databases, http://www.ics.uci.edu/~mlearn/MLRepository.html, 1998.
[59] M.D. Garris et al., NIST Form-Based Handprint Recognition System, NISTIR 5469, 1994.
[60] T. Golub et al., “Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression Monitoring,” Science, vol. 286, pp. 531-537, http://www.broad.mit.edu/cgi-bin/cancer/datasets.cgi, 1999.
[61] D.T. Ross et al., “Systematic Variation in Gene Expression Patterns in Human Cancer Cell Lines,” Nature Genetics, vol. 24, no. 3, pp. 227-234, 2000.
[62] U. Scherf et al., “A cDNA Microarray Gene Expression Database for the Molecular Pharmacology of Cancer,” Nature Genetics, vol. 24, no. 3, pp. 236-244, 2000.
[63] D. Singh et al., “Gene Expression Correlates of Clinical Prostate Cancer Behavior,” Cancer Cell, vol. 1, pp. 203-209, http://www.broad.mit.edu/cgi-bin/cancer/datasets.cgi, 2002.
[64] J.B. Welsh et al., “Analysis of Gene Expression Identifies Candidate Markers and Pharmacological Targets in Prostate Cancer,” Cancer Research, vol. 61, pp. 5974-5978, 2001.