Incremental Syntactic Parsing of Natural Language Corpora with Simple Synchrony Networks
March/April 2001 (vol. 13, no. 2), pp. 219-231

Abstract—This article explores the use of Simple Synchrony Networks (SSNs) for learning to parse English sentences drawn from a corpus of naturally occurring text. Parsing natural language sentences requires taking a sequence of words and outputting a hierarchical structure representing how those words fit together to form constituents. Feed-forward and Simple Recurrent Networks have had great difficulty with this task, in part because the number of relationships required to specify a structure is too large for the number of unit outputs they have available. SSNs have the representational power to output the necessary $O(n^2)$ possible structural relationships because they extend the $O(n)$ incremental outputs of Simple Recurrent Networks with the $O(n)$ entity outputs provided by Temporal Synchrony Variable Binding. This article presents an incremental representation of constituent structures that allows SSNs to make effective use of both these dimensions. Experiments on learning to parse naturally occurring text show that this output format supports both effective representation and effective generalization in SSNs. To emphasize the importance of this generalization ability, the article also proposes a short-term memory mechanism for retaining a bounded number of constituents during parsing. This mechanism reduces the $O(n^2)$ running time of the basic SSN architecture to linear time, but experiments confirm that the networks' generalization ability is maintained.
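To make the complexity argument concrete, the following minimal Python sketch illustrates the bounded short-term memory idea described in the abstract. It is an illustration under stated assumptions, not the authors' actual SSN computation: score_attachment stands in for the network's learned output for one (word, constituent) pair, and STM_SIZE for the hypothetical memory bound.

    from collections import deque

    STM_SIZE = 7  # hypothetical bound on constituents held in short-term memory

    def parse_incrementally(words, score_attachment):
        """Introduce one constituent per word and, for each constituent still
        held in the short-term memory (STM), decide whether the new word
        attaches to it. score_attachment(word, constituent) is a stand-in
        for the SSN's learned output for one (word, constituent) pair."""
        stm = deque(maxlen=STM_SIZE)  # oldest constituents are forgotten
        attachments = []              # (child, parent) structural relationships
        for i, word in enumerate(words):
            new_constituent = (i, word)
            # With the bound, this inner loop costs O(STM_SIZE) per word, so
            # the whole parse is linear; without it, the loop grows with i and
            # the parse costs O(n^2), matching the complexities in the abstract.
            for constituent in stm:
                if score_attachment(word, constituent) > 0.5:
                    attachments.append((new_constituent, constituent))
            stm.append(new_constituent)
        return attachments

The deque with a fixed maxlen models the abstract's claim directly: retaining only a bounded number of constituents caps the per-word structural decisions at a constant, while the incremental, word-by-word output format is unchanged.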

Index Terms:
Connectionist networks, natural language processing, simple synchrony networks, syntactic parsing, temporal synchrony variable binding.
Citation:
Peter C.R. Lane, James B. Henderson, "Incremental Syntactic Parsing of Natural Language Corpora with Simple Synchrony Networks," IEEE Transactions on Knowledge and Data Engineering, vol. 13, no. 2, pp. 219-231, March-April 2001, doi:10.1109/69.917562