
Nicola Giusti, Francesco Masulli, Alessandro Sperduti, "Theoretical and Experimental Analysis of a Two-Stage System for Classification," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 893-904, July 2002.
DOI: http://doi.ieeecomputersociety.org/10.1109/TPAMI.2002.1017617
Publisher: IEEE Computer Society, Los Alamitos, CA, USA
Keywords: multicategory classification, rejection, global and local classification, hierarchical classifier, Bayes classifier.
We consider a popular approach to multicategory classification tasks: a two-stage system based on a first (global) classifier with rejection followed by a (local) nearest-neighbor classifier. Patterns which are not rejected by the first classifier are classified according to its output. Rejected patterns are passed to the nearest-neighbor classifier together with the top-h ranking classes returned by the first classifier. The nearest-neighbor classifier, looking at patterns in the top-h classes, classifies the rejected pattern. An editing strategy for the nearest-neighbor reference database, controlled by the first classifier, is also considered. We analyze this system, showing that even if the first-level and nearest-neighbor classifiers are not optimal in a Bayes sense, the system as a whole may be optimal. Moreover, we formally relate the response time of the system to the rejection rate of the first classifier and to the other system parameters. The error-response time tradeoff is also discussed. Finally, we experimentally study two instances of the system applied to the recognition of handwritten digits. In one system, the first classifier is a fuzzy basis functions network, while in the second system it is a feedforward neural network. Classification results as well as response times for different settings of the system parameters are reported for both systems.
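The control flow described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the rejection rule (reject when the top score falls below a fixed `threshold`), the `toy_scores` stand-in for the first classifier, the Euclidean distance, and the `REFERENCES` data are all hypothetical choices made for the example.

```python
import math


def top_h(scores, h):
    """Return the h class labels with the highest scores, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:h]


def two_stage_classify(x, global_scores, references, h=2, threshold=0.7):
    """Classify pattern x and return (label, stage), where stage is 1 or 2.

    global_scores: hypothetical callable mapping a pattern to {label: score}.
    references:    list of (pattern, label) pairs used by the second stage.
    """
    scores = global_scores(x)
    ranked = top_h(scores, h)
    best = ranked[0]
    # Assumed rejection rule: accept only if the top score is high enough.
    if scores[best] >= threshold:
        return best, 1
    # Rejected: search only reference patterns labeled with a top-h class.
    candidates = [(p, c) for p, c in references if c in ranked]
    if not candidates:
        return best, 1  # no reference pattern available; keep the stage-1 answer
    _, label = min(candidates, key=lambda pc: math.dist(x, pc[0]))
    return label, 2


def toy_scores(x):
    """Hypothetical first classifier: normalized exp(-distance) to class prototypes."""
    prototypes = {"a": (0.0, 0.0), "b": (4.0, 0.0), "c": (0.0, 4.0)}
    weights = {c: math.exp(-math.dist(x, p)) for c, p in prototypes.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


# Hypothetical reference database for the nearest-neighbor stage.
REFERENCES = [
    ((1.0, 0.0), "a"),
    ((2.0, 0.5), "a"),
    ((3.5, 0.0), "b"),
    ((2.0, 3.0), "c"),
]

if __name__ == "__main__":
    # Confident pattern: accepted by the first stage.
    print(two_stage_classify((0.1, 0.0), toy_scores, REFERENCES))
    # Ambiguous pattern: rejected, then resolved among the top-2 classes.
    print(two_stage_classify((2.1, 0.0), toy_scores, REFERENCES))
```

In this toy setting the second pattern lies between the prototypes of classes "a" and "b", so the first stage rejects it and the nearest-neighbor stage decides using only reference patterns from those two classes. Restricting the search to the top-h classes is what ties the cost of the second stage to both the rejection rate and h.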