<p>The use of stochastic context-free grammars (SCFGs) in language modeling (LM) has been considered previously. When these grammars are used, search can be directed by evaluation functions based on the probability that an SCFG generates a sentence, given only some of the words in it. Jelinek and Lafferty (1991) proposed expressions for computing the evaluation function when recognizing word sequences of which only the prefix is known. Corazza et al. (1991) proposed methods for computing the probability in the more general case, in which partial word sequences interleaved with gaps are known. This computation is too complex in practice unless the lengths of the gaps are known. This paper proposes a method for computing the probability of the best parse tree that can generate a sentence of which only part (consisting of islands and gaps) is known. This probability is the minimum possible, and thus the most informative, upper bound that can be used in the evaluation function. Computing the proposed upper bound has cubic time complexity even if the lengths of the gaps are unknown. This makes it practical to use SCFGs for driving the interpretation of sentences in natural language processing.</p>
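To make "the probability of the best parse tree" concrete, the following is a minimal sketch of the standard Viterbi variant of the CKY algorithm for a fully known sentence. It is not the island-and-gap method proposed in the paper, which the abstract does not specify in detail. It assumes the grammar is in Chomsky normal form; the function name `best_parse_probability`, the dictionary encoding of the rules, and the toy grammar are all hypothetical choices made for illustration. The three nested loops over span length, start position, and split point give the cubic dependence on sentence length.

```python
from collections import defaultdict


def best_parse_probability(words, lexical, binary, start="S"):
    """Probability of the most likely parse of `words` under a CNF SCFG.

    lexical: dict mapping (A, word) -> P(A -> word)   (assumed encoding)
    binary:  dict mapping (A, B, C) -> P(A -> B C)    (assumed encoding)
    """
    n = len(words)
    if n == 0:
        return 0.0

    # best[i][j][A] = max probability that nonterminal A derives words[i:j]
    best = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n + 1)]

    # Spans of length 1: apply lexical rules A -> word.
    for i, w in enumerate(words):
        for (a, word), p in lexical.items():
            if word == w and p > best[i][i + 1][a]:
                best[i][i + 1][a] = p

    # Longer spans: maximize (not sum) over split points and binary rules.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                left, right = best[i][k], best[k][j]
                for (a, b, c), p in binary.items():
                    pb = left.get(b, 0.0)
                    pc = right.get(c, 0.0)
                    if pb > 0.0 and pc > 0.0:
                        candidate = p * pb * pc
                        if candidate > best[i][j][a]:
                            best[i][j][a] = candidate

    return best[0][n].get(start, 0.0)


# Toy grammar, invented for illustration only.
LEXICAL = {("NP", "she"): 0.5, ("NP", "fish"): 0.5, ("V", "eats"): 1.0}
BINARY = {("S", "NP", "VP"): 1.0, ("VP", "V", "NP"): 1.0}

if __name__ == "__main__":
    print(best_parse_probability(["she", "eats", "fish"], LEXICAL, BINARY))
```

Replacing the maximization with a sum would give the total probability of the sentence (the inside probability) instead of the probability of its single best parse.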
context-sensitive grammars; computational complexity; natural languages; probability; optimal probabilistic evaluation functions; search; stochastic context-free grammars; language modeling; probabilities; word sequences; probability computation; partial word sequences; best parse tree; cubic time complexity; natural language processing

A. Corazza, R. De Mori, G. Satta and R. Gretter, "Optimal Probabilistic Evaluation Functions for Search Controlled by Stochastic Context-Free Grammars," in IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 16, no. , pp. 1018-1027, 1994.