Rule Revision With Recurrent Neural Networks
February 1996 (vol. 8 no. 1)
pp. 183-188

Abstract: Recurrent neural networks readily process, recognize, and generate temporal sequences. By encoding grammatical strings as temporal sequences, recurrent neural networks can be trained to behave like deterministic sequential finite-state automata, and algorithms have been developed for extracting grammatical rules from trained networks. Using a simple method for inserting prior knowledge (or rules) into recurrent neural networks, we show that these networks are able to perform rule revision. Rule revision is performed by comparing the inserted rules with the rules in the finite-state automata extracted from trained networks. Results from training a recurrent neural network to recognize a known, nontrivial, randomly generated regular grammar show that the networks not only preserve correct rules but also correct, through training, inserted rules that were initially incorrect (that is, rules that were not among those of the randomly generated grammar).
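
The comparison step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes, purely for illustration, that each rule is a transition `(state, symbol) -> next state`, that the inserted, extracted, and target automata share the same state labels, and that the target grammar defines every rule that was inserted. The names `compare_rules`, `Rules`, and the two-state example automaton are hypothetical.

```python
from typing import Dict, List, Tuple

Rule = Tuple[str, str]     # (current state, input symbol); hypothetical encoding
Rules = Dict[Rule, str]    # maps a rule's left-hand side to its next state


def compare_rules(inserted: Rules, extracted: Rules, target: Rules) -> Dict[str, List[Rule]]:
    """Classify each inserted rule by what training appears to have done to it.

    Assumes all three automata use the same state labels and that `target`
    contains every key of `inserted` (an illustrative simplification).
    """
    report: Dict[str, List[Rule]] = {
        "preserved": [],    # inserted correctly, still correct after training
        "corrected": [],    # inserted incorrectly, correct after training
        "uncorrected": [],  # inserted incorrectly, still incorrect
        "corrupted": [],    # inserted correctly, incorrect after training
    }
    for key, inserted_next in inserted.items():
        was_correct = inserted_next == target[key]
        is_correct = extracted.get(key) == target[key]
        if was_correct and is_correct:
            report["preserved"].append(key)
        elif not was_correct and is_correct:
            report["corrected"].append(key)
        elif not was_correct:
            report["uncorrected"].append(key)
        else:
            report["corrupted"].append(key)
    return report


# Hypothetical two-state automaton over the alphabet {0, 1}.
target = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q1", ("q1", "1"): "q0",
}
# One correct rule and one deliberately incorrect rule are inserted.
inserted = {("q0", "1"): "q1", ("q1", "1"): "q1"}
# Suppose the automaton extracted after training matches the target exactly.
extracted = dict(target)

report = compare_rules(inserted, extracted, target)
```

In this toy case the correctly inserted rule lands in `report["preserved"]` and the incorrectly inserted one in `report["corrected"]`, mirroring the two outcomes the abstract reports. In practice, states of an extracted automaton would first have to be matched to those of the inserted rules, which this sketch leaves out.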

Index Terms:
Deterministic finite-state automata, genuine and incorrect rules, knowledge insertion and extraction, recurrent neural networks, regular languages, rule revision.
Christian W. Omlin, C. Lee Giles, "Rule Revision With Recurrent Neural Networks," IEEE Transactions on Knowledge and Data Engineering, vol. 8, no. 1, pp. 183-188, Feb. 1996, doi:10.1109/69.485647