ASCII Text
Kenneth E. Hild, Deniz Erdogmus, Kari Torkkola, Jose C. Principe, "Feature Extraction Using Information-Theoretic Learning," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 9, pp. 1385-1392, September, 2006.
BibTeX
@article{10.1109/TPAMI.2006.186,
  author = {Kenneth E. Hild and Deniz Erdogmus and Kari Torkkola and Jose C. Principe},
  title = {Feature Extraction Using Information-Theoretic Learning},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume = {28},
  number = {9},
  issn = {0162-8828},
  year = {2006},
  pages = {1385-1392},
  doi = {http://doi.ieeecomputersociety.org/10.1109/TPAMI.2006.186},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
RefWorks Procite/RefMan/Endnote
TY - JOUR
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
TI - Feature Extraction Using Information-Theoretic Learning
IS - 9
SN - 0162-8828
SP - 1385
EP - 1392
EPD - 1385-1392
A1 - Kenneth E. Hild
A1 - Deniz Erdogmus
A1 - Kari Torkkola
A1 - Jose C. Principe
PY - 2006
KW - Feature extraction
KW - information theory
KW - classification
KW - nonparametric statistics
VL - 28
JA - IEEE Transactions on Pattern Analysis and Machine Intelligence
ER -
[1] B.D. Ripley, Pattern Recognition and Neural Networks. Cambridge Univ. Press, 1995.
[2] T.M. Cover and J.A. Thomas, Elements of Information Theory. John Wiley & Sons, 1991.
[3] J.C. Principe, D. Xu, Q. Zhao, and J.W. Fisher III, “Learning from Examples with Information Theoretic Criteria,” J. VLSI Signal Proc. Systems, vol. 26, nos. 1/2, pp. 61-77, Aug. 2000.
[4] D. Erdogmus and J.C. Principe, “Lower and Upper Bounds for Misclassification Probability Based on Renyi's Information,” J. VLSI Signal Processing, vol. 37, nos. 2-3, pp. 305-317, June 2004.
[5] M.E. Hellman and J. Raviv, “Probability of Error, Equivocation, and the Chernoff Bound,” IEEE Trans. Information Theory, vol. 16, no. 4, pp. 368-372, July 1970.
[6] R. Battiti, “Using Mutual Information for Selecting Features in Supervised Neural Net Learning,” IEEE Trans. Neural Networks, vol. 5, no. 4, pp. 537-550, July 1994.
[7] H.H. Yang and J. Moody, “Feature Selection Based on Joint Mutual Information,” Proc. Conf. Advances in Intelligent Data Analysis, Computational Intelligence Methods, and Applications, June 1999.
[8] K.D. Bollacker and J. Ghosh, “Mutual Information Feature Extractors for Neural Classifiers,” Proc. Int'l Conf. Neural Networks (ICNN '96), pp. 1528-1533, June 1996.
[9] N. Kwak and C.-H. Choi, “Improved Mutual Information Feature Selector for Neural Networks in Supervised Learning,” Proc. Int'l Joint Conf. Neural Networks, vol. 2, pp. 1313-1318, July 1999.
[10] R. Rajagopal, K.A. Kumar, and P.R. Rao, “An Integrated Approach to Passive Target Classification,” Proc. Int'l Conf. Acoustics, Speech, and Signal Processing, vol. 2, pp. 313-316, Apr. 1994.
[11] K.E. Hild II, D. Erdogmus, and J.C. Principe, “An Analysis of Entropy Estimators for Blind Source Separation,” Signal Processing, vol. 86, no. 1, pp. 182-194, Jan. 2006.
[12] A. Renyi, Probability Theory. Amsterdam: North-Holland Publishing Company, 1970.
[13] K.E. Hild II, D. Erdogmus, and J.C. Principe, “On-Line Minimum Mutual Information Method for Time-Varying Blind Source Separation,” Proc. Int'l Workshop Independent Component Analysis and Signal Separation, pp. 126-131, Dec. 2001.
[14] D. Erdogmus, K.E. Hild II, and J.C. Principe, “On-Line Entropy Manipulation: Stochastic Information Gradient,” IEEE Signal Processing Letters, vol. 10, no. 8, pp. 242-245, Aug. 2003.
[15] J. Beirlant, E.J. Dudewicz, L. Györfi, and E. van der Meulen, “Nonparametric Entropy Estimation: An Overview,” Int'l J. Math. Statistics Sciences, vol. 6, no. 1, pp. 17-39, 1997.
[16] E. Parzen, “On Estimation of a Probability Density Function and Mode,” Annals of Math. Statistics, vol. 33, no. 3, pp. 1065-1076, Sept. 1962.
[17] G.H. Golub and C.F. Van Loan, Matrix Computations, third ed. Baltimore: Johns Hopkins Univ. Press, 1996.
[18] S. Theodoridis and K. Koutroumbas, Pattern Recognition. San Diego, Calif.: Academic Press, 1999.
[19] K.E. Hild II, D. Erdogmus, and J.C. Principe, “Blind Source Separation Using Renyi's Mutual Information,” IEEE Signal Processing Letters, vol. 8, no. 6, pp. 174-176, June 2001.
[20] R.A. Morejon, “An Information-Theoretic Approach to Sonar Automatic Target Recognition,” PhD dissertation, Univ. of Florida, 2003.
[21] C. Bishop, Neural Networks for Pattern Recognition. Oxford, U.K.: Oxford Univ. Press, 1995.
[22] S.C. Fralick and R.W. Scott, “Nonparametric Bayes-Risk Estimation,” IEEE Trans. Information Theory, vol. 17, no. 4, pp. 440-444, July 1971.
[23] K. Torkkola, “On Feature Extraction by Mutual Information Maximization,” Proc. Int'l Conf. Acoustics, Speech, and Signal Processing, pp. 821-825, May 2002.
[24] K. Torkkola, “Learning Discriminative Feature Transforms to Low Dimensions in Low Dimensions,” Proc. Conf. Advances in Neural Information Processing Systems, Dec. 2001.
[25] K. Torkkola and W.M. Campbell, “Mutual Information in Learning Feature Transformations,” Proc. Int'l Conf. Machine Learning, pp. 1015-1022, June 2000.
[26] K. Torkkola, “Visualizing Class Structure in Data Using Mutual Information,” Proc. Conf. Neural Networks for Signal Proc. (NNSP '00), pp. 376-385, Dec. 2000.
[27] D. Xu and J.C. Principe, “Feature Evaluation Using Quadratic Mutual Information,” Proc. Int'l Joint Conf. Neural Networks, vol. 1, pp. 459-463, July 2001.
[28] A. Biem, S. Katagiri, and B.-H. Juang, “Pattern Recognition Using Discriminative Feature Extraction,” IEEE Trans. Signal Processing, vol. 45, no. 2, pp. 500-504, Feb. 1997.
[29] H. Watanabe, T. Yamaguchi, and S. Katagiri, “Discriminative Metric Design for Robust Pattern Recognition,” IEEE Trans. Signal Processing, vol. 45, no. 11, pp. 2655-2662, Nov. 1997.
[30] S. Katagiri, B.-H. Juang, and C.-H. Lee, “Pattern Recognition Using a Family of Design Algorithms Based upon the Generalized Probabilistic Descent Method,” Proc. IEEE, vol. 86, no. 11, pp. 2345-2373, Nov. 1998.
[31] B.-H. Juang and S. Katagiri, “Discriminative Learning for Minimum Error Classification,” IEEE Trans. Signal Processing, vol. 40, no. 12, pp. 3043-3054, Dec. 1992.
[32] A. Biem, S. Katagiri, and B.-H. Juang, “Discriminative Feature Extraction for Speech Recognition,” Proc. Conf. Neural Networks for Signal Processing (NNSP '93), pp. 392-401, Sept. 1993.
[33] Q. Li and B.-H. Juang, “A New Algorithm for Fast Discriminative Training,” Proc. Int'l Conf. Acoustics, Speech, and Signal Processing (ICASSP '02), vol. 1, pp. 97-100, May 2002.
[34] V. Nedeljkovic, “A Novel Multilayer Neural Networks Training Algorithm that Minimizes the Probability of Classification Error,” IEEE Trans. Neural Networks, vol. 4, no. 4, pp. 650-659, July 1993.
[35] K. Fukunaga, Introduction to Statistical Pattern Recognition, second ed. Boston: Academic Press, 1990.
[36] D. Erdogmus, K.E. Hild II, and J.C. Principe, “Kernel Size Selection in Parzen Density Estimation,” J. VLSI Signal Processing Systems, submitted.
[37] D. Erdogmus and J.C. Principe, “Generalized Information Potential Criterion for Adaptive System Training,” IEEE Trans. Neural Networks, Sept. 2002.
[38] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-Based Learning Applied to Document Recognition,” Proc. IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998.

