Theoretical and Experimental Analysis of a Two-Stage System for Classification
July 2002 (vol. 24 no. 7)
pp. 893-904

We consider a popular approach to multicategory classification tasks: a two-stage system based on a first (global) classifier with rejection followed by a (local) nearest-neighbor classifier. Patterns that are not rejected by the first classifier are classified according to its output. Rejected patterns are passed to the nearest-neighbor classifier together with the top-h ranking classes returned by the first classifier. The nearest-neighbor classifier, looking only at reference patterns in the top-h classes, classifies the rejected pattern. An editing strategy for the nearest-neighbor reference database, controlled by the first classifier, is also considered. We analyze this system, showing that even if the first-level and nearest-neighbor classifiers are not optimal in a Bayes sense, the system as a whole may be optimal. Moreover, we formally relate the response time of the system to the rejection rate of the first classifier and to the other system parameters. The error-response time trade-off is also discussed. Finally, we experimentally study two instances of the system applied to the recognition of handwritten digits. In one system, the first classifier is a fuzzy basis functions network, while in the second system it is a feed-forward neural network. Classification results as well as response times for different settings of the system parameters are reported for both systems.
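
The control flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The rejection rule (a threshold on the top first-stage score), the Euclidean distance, and the names `two_stage_classify`, `toy_scores`, `refs`, and `threshold` are all assumptions made for illustration. The editing strategy for the reference database is not modeled.

```python
import math

def two_stage_classify(x, first_stage_scores, references, h=2, threshold=0.8):
    """Return (label, was_rejected) for pattern x.

    first_stage_scores: callable mapping x to {class_label: score} (hypothetical).
    references: list of (vector, class_label) pairs for the nearest-neighbor stage.
    """
    scores = first_stage_scores(x)
    ranked = sorted(scores, key=scores.get, reverse=True)
    best = ranked[0]
    # Assumed rejection rule: accept only if the top score reaches the threshold.
    if scores[best] >= threshold:
        return best, False
    # Rejected: search only reference patterns belonging to the top-h classes.
    candidates = set(ranked[:h])
    pool = [(v, c) for v, c in references if c in candidates]
    if not pool:
        return best, True  # no usable references, fall back to the first stage
    _, label = min(pool, key=lambda vc: math.dist(x, vc[0]))
    return label, True

def toy_scores(x):
    # Hypothetical first stage: normalized exp(-distance) to fixed class centers.
    centers = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.0, 5.0)}
    exps = {k: math.exp(-math.dist(x, c)) for k, c in centers.items()}
    total = sum(exps.values())
    return {k: v / total for k, v in exps.items()}

# Hypothetical reference database for the nearest-neighbor stage.
refs = [((0.1, 0.0), "a"), ((0.9, 0.0), "b"), ((0.0, 0.05), "c")]

print(two_stage_classify((0.6, 0.0), toy_scores, refs))
```

In this sketch, raising `threshold` sends more patterns to the slower nearest-neighbor stage, and raising `h` enlarges the set of reference patterns it must scan. Both are the kind of parameter the abstract ties to response time.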


Index Terms:
Multicategory classification, rejection, global and local classification, hierarchical classifier, Bayes classifier.
Nicola Giusti, Francesco Masulli, Alessandro Sperduti, "Theoretical and Experimental Analysis of a Two-Stage System for Classification," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 893-904, July 2002, doi:10.1109/TPAMI.2002.1017617