Binary Rule Generation via Hamming Clustering
Marco Muselli and Diego Liberati
IEEE Transactions on Knowledge and Data Engineering, vol. 14, no. 6, pp. 1258-1268, Nov.-Dec. 2002

Abstract—The generation of a set of rules underlying a classification problem is performed by applying a new algorithm called Hamming Clustering (HC). It reconstructs the AND-OR expression associated with any Boolean function from a training set of samples. The basic kernel of the method is the generation of clusters of input patterns that belong to the same class and are close to each other according to the Hamming distance. Inputs that do not influence the final output are identified, thus automatically reducing the complexity of the final set of rules. The performance of HC has been evaluated on a variety of artificial and real-world benchmarks. In particular, its application to the diagnosis of breast cancer has led to the derivation of a reduced set of rules solving the associated classification problem.
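The abstract names the core mechanism: binary patterns of the same class that are close in Hamming distance are merged into clusters, inputs that stop mattering become don't-cares, and the surviving clusters read off as AND-OR rules. The Python sketch below is a minimal illustration of that idea under stated assumptions, not the authors' HC algorithm: it greedily merges positive patterns at Hamming distance 1 into implicants, rejecting any merge that would cover a negative sample. The function names (hamming_cluster, try_merge) and the greedy merge order are illustrative choices; the paper's method adds pattern-selection and pruning criteria not reproduced here.

from itertools import combinations

def covers(rule, pattern):
    # A rule covers a pattern when every fixed literal matches;
    # '-' marks a don't-care input and matches anything.
    return all(r == '-' or r == p for r, p in zip(rule, pattern))

def try_merge(a, b):
    # Two rules merge only if they differ in exactly one fixed literal;
    # that input becomes a don't-care in the enlarged rule.
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) != 1 or '-' in (a[diff[0]], b[diff[0]]):
        return None
    i = diff[0]
    return a[:i] + ('-',) + a[i + 1:]

def hamming_cluster(positives, negatives):
    # Greedy bottom-up clustering: start from the positive training
    # patterns and keep merging rules at Hamming distance 1, as long
    # as the enlarged rule still excludes every negative pattern.
    rules = set(positives)
    merged = True
    while merged:
        merged = False
        for a, b in combinations(sorted(rules), 2):
            m = try_merge(a, b)
            if m is not None and not any(covers(m, n) for n in negatives):
                rules -= {a, b}
                rules.add(m)
                merged = True
                break  # rule set changed; restart the pairwise scan
    return rules

def to_expression(rule):
    # Render one rule as an AND of literals; don't-care inputs drop out.
    lits = [f"x{i + 1}" if v == '1' else f"NOT x{i + 1}"
            for i, v in enumerate(rule) if v != '-']
    return " AND ".join(lits) or "TRUE"

# Toy training set for f(x1, x2, x3) = 1 iff x1 = x2 (incomplete on purpose):
pos = [('0','0','0'), ('0','0','1'), ('1','1','0'), ('1','1','1')]
neg = [('0','1','0'), ('1','0','1')]
for r in sorted(hamming_cluster(pos, neg)):
    print(to_expression(r))
# prints:
#   NOT x1 AND NOT x2
#   x1 AND x2

The '-' entries are what let irrelevant inputs drop out of a rule: in the toy run, x3 never appears in either printed rule, which is the sense in which clustering "automatically reduces the complexity of the final set of rules."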


Index Terms:
Rule generation, Hamming clustering, knowledge discovery, Boolean function approximation, generalization.
Citation:
Marco Muselli, Diego Liberati, "Binary Rule Generation via Hamming Clustering," IEEE Transactions on Knowledge and Data Engineering, vol. 14, no. 6, pp. 1258-1268, Nov.-Dec. 2002, doi:10.1109/TKDE.2002.1047766