The Multiscale Classifier
February 1996 (vol. 18 no. 2)
pp. 124-137

Abstract—In this paper we propose a rule-based inductive learning algorithm called Multiscale Classification (MSC). It can be applied to any N-dimensional real or binary classification problem to classify the training data by successively splitting the feature space in half. The algorithm has several significant differences from existing rule-based approaches: learning is incremental, the tree is non-binary, and backtracking of decisions is possible to some extent.

The paper first provides background on current machine learning techniques and outlines some of their strengths and weaknesses. It then describes the MSC algorithm and compares it to other inductive learning algorithms with particular reference to ID3, C4.5, and back-propagation neural networks. Its performance on a number of standard benchmark problems is then discussed and related to standard learning issues such as generalization, representational power, and over-specialization.
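The halving idea described above can be sketched in a few lines. The toy code below is an illustration only, not the authors' MSC implementation: the names `build` and `classify`, the majority-vote leaves, and the depth limit are all assumptions made for the sketch. Each node splits its cell in half along every dimension at once, giving 2^N children per node, which echoes the non-binary tree mentioned in the abstract.

```python
# Toy sketch of multiscale splitting: halve the feature space along every
# dimension at once (quadtree-style in 2D, 2^N children in N dimensions)
# until a cell contains a single class or a depth limit is reached.
# Illustrative only: no incremental learning or backtracking here.

def build(X, y, lo, hi, depth=0, max_depth=6):
    """Recursively decompose the cell [lo, hi) holding points X with labels y."""
    if len(set(y)) <= 1 or depth == max_depth:
        # Leaf: majority label of the training points that fell in this cell.
        return {"label": max(set(y), key=y.count)}
    mid = tuple((a + b) / 2 for a, b in zip(lo, hi))
    # Route each point to one of the 2^N half-cells.
    buckets = {}
    for p, label in zip(X, y):
        key = tuple(int(p[d] >= mid[d]) for d in range(len(p)))
        buckets.setdefault(key, ([], []))
        buckets[key][0].append(p)
        buckets[key][1].append(label)
    children = {}
    for key, (Xs, ys) in buckets.items():
        c_lo = tuple(mid[d] if key[d] else lo[d] for d in range(len(mid)))
        c_hi = tuple(hi[d] if key[d] else mid[d] for d in range(len(mid)))
        children[key] = build(Xs, ys, c_lo, c_hi, depth + 1, max_depth)
    return {"mid": mid, "children": children}

def classify(node, p):
    """Descend to the finest cell containing p; None if no training data fell there."""
    while "children" in node:
        key = tuple(int(p[d] >= node["mid"][d]) for d in range(len(p)))
        if key not in node["children"]:
            return None
        node = node["children"][key]
    return node["label"]
```

On a well-separated two-class problem in the unit square, a single split is enough: points near the origin land in one quadrant and points near (1, 1) in another, each of which becomes a pure leaf. Note that a query falling in an empty quadrant returns no label, which is one motivation for the refinements (pruning, backtracking) the paper discusses.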

[1] M. Anthony and N. Biggs, Computational Learning Theory: An Introduction. Cambridge Univ. Press, 1992.
[2] R.E. Blahut, Digital Transmission of Information. Reading, Mass.: Addison-Wesley, 1990, pp. 306-313.
[3] A. Blumer, A. Ehrenfeucht, D. Haussler, and M. Warmuth, "Occam's Razor," Information Processing Letters, vol. 24, North-Holland, pp. 377-380, 1987.
[4] L. Breiman, J. Friedman, R. Olshen, and C. Stone, Classification and Regression Trees. Belmont, Calif.: Wadsworth, 1984.
[5] R. Burnett, "Theory and application of the multiscale classifier," Dept. Electrical and Computer Eng., Univ. of Queensland, honors thesis, 1994.
[6] B. Cestnik, I. Kononenko, and I. Bratko, "Assistant 86: A knowledge-elicitation tool for sophisticated users," Machine Learning, I. Bratko and N. Lavrac, eds. Wilmslow: Sigma Press, 1987.
[7] B. Cestnik and I. Bratko, "On estimating probabilities in tree pruning," Machine Learning: EWSL-91: European Working Session on Learning, Y. Kodratoff, ed., Porto, Portugal: Lecture Notes in Artificial Intelligence, vol. 482, pp. 138-150, 1991.
[8] S.E. Fahlman and C. Lebiere, "The Cascade-Correlation Learning Architecture," in Advances in Neural Information Processing Systems 2, D.S. Touretzky, ed., Morgan Kaufmann, San Mateo, Calif., 1990, pp. 524-532.
[9] R. Fisher, "The use of multiple measurements in taxonomic problems," Annals of Eugenics, vol. 7, pp. 179-188, 1936.
[10] R. Forsyth and R. Rada, Machine Learning: Applications in Expert Systems and Information Retrieval. West Sussex: Ellis Horwood Limited, 1986.
[11] A.D. Gordon, Classification. Chapman and Hall, 1981.
[12] D.J. Hand, Discrimination and Classification. John Wiley & Sons, 1981.
[13] R.C. Holte, “Very Simple Classification Rules Perform Well on Most Commonly Used Datasets,” Machine Learning, vol. 11, pp. 63–91, 1993.
[14] U. Knoll, G. Nakhaeizadeh, and B. Tausend, "Cost sensitive pruning of decision trees," Machine Learning: Proc. ECML-94, pp. 383-386, 1994.
[15] K.J. Lang and M.J. Witbrock, "Learning to tell two spirals apart," Proc. 1988 Connectionist Models Summer School. Morgan Kaufmann, 1988.
[16] J.L. McClelland and D.E. Rumelhart, Explorations in Parallel Distributed Processing. Cambridge, Mass.: MIT Press, 1986.
[17] D. Michie and R. Chambers, "Boxes: An experiment in adaptive control," Machine Intelligence 2, Dale and Michie, eds., Edinburgh Univ. Press, 1968.
[18] J. Mingers, “An Empirical Comparison of Pruning Methods for Decision Tree Induction,” Machine Learning, vol. 4, no. 2, pp. 227-243, 1989.
[19] M.L. Minsky and S.A. Papert, Perceptrons. Cambridge, Mass.: MIT Press, 1969.
[20] J.R. Quinlan, "Semi-autonomous acquisition of pattern-based knowledge," Introductory Readings in Expert Systems, D. Michie, ed., Gordon and Breach, 1982.
[21] J.R. Quinlan, “Simplifying decision trees,” Int’l J. Man-Machine Studies, vol. 27, pp. 221-234, 1987.
[22] J.R. Quinlan, P.J. Compton, K.A. Horn, and L. Lazarus, “Inductive knowledge acquisition: A case study,” Applications of Expert Systems, J.R. Quinlan, ed., pp. 157-173, Sydney, Australia: Addison-Wesley, 1987.
[23] J.R. Quinlan, C4.5: Programs for Machine Learning. San Mateo, Calif.: Morgan Kaufmann, 1992.
[24] F. Rosenblatt, "The perceptron: A probabilistic model for information storage and organization in the brain," Psychological Review, vol. 65, pp. 386-408, 1958.
[25] H. Samet,“The quadtree and related hierarchical data structures,” ACM Computing Surveys, vol. 16, no. 2, pp. 187-260, June 1984.
[26] A. Samuel, "Some studies in machine learning using the game of checkers, part II," IBM J. Research and Development, vol. 11, no. 4, pp. 601-618, 1967.
[27] J.W. Smith, J.E. Everhart, W.C. Dickson, W.C. Knowler, and R.S. Johannes, "Using the ADAP learning algorithm to forecast the onset of diabetes mellitus," Proc. Symp. Computer Applications in Medical Care, IEEE Computer Society Press, pp. 261-265, 1988.
[28] S.B. Thrun, "The MONK's problem: A performance comparison of different learning algorithms," Carnegie-Mellon Univ., tech. report, 1991.
[29] R.F. Tinder, Digital Engineering Design: A Modern Approach. Englewood Cliffs, N.J.: Prentice Hall, 1991.
[30] S. Weiss and C. Kulikowski, Computer Systems That Learn: Classification and Prediction Methods from Statistics, Neural Nets, Machine Learning, and Expert Systems, Morgan Kaufmann, 1991.
[31] W.H. Wolberg and O.L. Mangasarian, "Multisurface method of pattern separation for medical diagnosis applied to breast cytology," Proc. Nat'l Academy of Sciences, vol. 87, no. 12, pp. 9193-9196, 1990.

Index Terms:
Multiscale classification, decision trees, inductive machine learning, tree pruning.
Brian C. Lovell, Andrew P. Bradley, "The Multiscale Classifier," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 2, pp. 124-137, Feb. 1996, doi:10.1109/34.481538