Generalized Risk Zone: Selecting Observations for Classification
July 2009 (vol. 31, no. 7)
pp. 1331-1337
R.T. Peres, Federal University of Rio de Janeiro (UFRJ), Rio de Janeiro
C.E. Pedreira, Federal University of Rio de Janeiro (UFRJ), Rio de Janeiro
In this paper, we extend the risk zone concept by creating the Generalized Risk Zone, a model-independent scheme for selecting key observations from a sample set. In our experiments, classifiers trained only on the observations belonging to the Generalized Risk Zone achieved performance comparable to, and in some cases better than, classifiers trained on the whole sample. The main tool enabling this extension is the Cauchy-Schwarz divergence, used as a measure of dissimilarity between probability densities. To sidestep explicit estimation of the underlying probability density functions, we draw on Information Theoretic Learning, which allows the divergence to be computed from the available observations alone. We evaluated the proposed methodology with Learning Vector Quantization, feedforward neural networks, Support Vector Machines, and nearest neighbors.
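The computational core of the approach described above is estimating the Cauchy-Schwarz divergence directly from two sample sets, without fitting explicit densities. The following is a minimal sketch of the standard Information Theoretic Learning plug-in estimator, in which each density is replaced by a Gaussian Parzen-window estimate so that every integral collapses into a double sum over sample pairs; the function names, the bandwidth value, and the toy data are illustrative assumptions, not taken from the paper.

    import numpy as np

    def cross_information_potential(X, Y, sigma):
        # Parzen estimate of the integral of p(x)q(x), where p and q are
        # Gaussian-kernel density estimates built from samples X and Y.
        # Convolving two Gaussian kernels of width sigma yields a single
        # Gaussian of variance 2*sigma^2, so the integral reduces to a
        # double sum over all sample pairs.
        d = X.shape[1]
        s2 = 2.0 * sigma ** 2
        sq_dists = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=2)
        norm = (2.0 * np.pi * s2) ** (d / 2.0)
        return np.mean(np.exp(-sq_dists / (2.0 * s2))) / norm

    def cauchy_schwarz_divergence(X, Y, sigma=1.0):
        # D_CS(p, q) = -log( (int pq)^2 / (int p^2 * int q^2) )
        #            = log(int p^2) + log(int q^2) - 2 log(int pq).
        # Nonnegative by the Cauchy-Schwarz inequality, and zero only
        # when the two densities coincide.
        v_xy = cross_information_potential(X, Y, sigma)
        v_xx = cross_information_potential(X, X, sigma)
        v_yy = cross_information_potential(Y, Y, sigma)
        return np.log(v_xx) + np.log(v_yy) - 2.0 * np.log(v_xy)

    # Example: samples from well-separated Gaussians give a clearly
    # positive divergence; samples from the same density give a value
    # near zero.
    rng = np.random.default_rng(0)
    A = rng.normal(0.0, 1.0, size=(200, 2))
    B = rng.normal(3.0, 1.0, size=(200, 2))
    print(cauchy_schwarz_divergence(A, B, sigma=0.5))

In the paper's setting, a divergence estimated this way would quantify how much keeping or discarding candidate observations changes the estimated class-conditional densities; the bandwidth sigma plays the usual Parzen-window role and has to be tuned to the data at hand.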

Index Terms:
Classification, neural networks, observation selection, risk zone, support vector machine.
Citation:
R.T. Peres, C.E. Pedreira, "Generalized Risk Zone: Selecting Observations for Classification," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 7, pp. 1331-1337, July 2009, doi:10.1109/TPAMI.2008.269