Complexity Measures of Supervised Classification Problems
March 2002 (vol. 24 no. 3)
pp. 289-300

We studied a number of measures that characterize the difficulty of a classification problem, focusing on the geometrical complexity of the class boundary. We compared a set of real-world problems to random labelings of points and found that real problems contain structures in this measurement space that are significantly different from the random sets. Distributions of problems in this space show that there exist at least two independent factors affecting a problem's difficulty. We suggest using this space to describe a classifier's domain of competence. This can guide static and dynamic selection of classifiers for specific problems as well as subproblems formed by confinement, projection, and transformations of the feature vectors.
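As a rough illustration of the kind of measure the abstract describes, the sketch below computes a per-feature Fisher discriminant ratio (the maximum over features is one classical separability measure) for a synthetic two-class problem, and contrasts it with a random relabeling of the same points. The data, function name, and threshold interpretation are illustrative assumptions, not the paper's exact measure suite; a real problem with class structure should score much higher than its randomly labeled counterpart.

```python
import numpy as np

def fisher_ratio(X, y):
    """Per-feature Fisher discriminant ratio (mu0 - mu1)^2 / (var0 + var1).
    Larger values mean the classes are more linearly separable on that feature."""
    X0, X1 = X[y == 0], X[y == 1]
    num = (X0.mean(axis=0) - X1.mean(axis=0)) ** 2
    den = X0.var(axis=0) + X1.var(axis=0)
    return num / den

rng = np.random.default_rng(0)
# Two well-separated Gaussian classes in 2D (hypothetical example data).
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)),
               rng.normal(4.0, 1.0, (100, 2))])
y = np.repeat([0, 1], 100)

f_real = fisher_ratio(X, y).max()                   # structured labels: large ratio
f_rand = fisher_ratio(X, rng.permutation(y)).max()  # random labels: ratio near zero
print(f_real, f_rand)
```

The gap between `f_real` and `f_rand` mirrors the paper's observation that real problems occupy a distinctly different region of the complexity-measure space than random labelings of the same points.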


Index Terms:
classification, clustering, complexity, linear separability, mixture identifiability
Citation:
T.K. Ho, M. Basu, "Complexity Measures of Supervised Classification Problems," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 3, pp. 289-300, March 2002, doi:10.1109/34.990132