A Cache-Based Natural Language Model for Speech Recognition
June 1990 (vol. 12 no. 6)
pp. 570-583

Speech-recognition systems must often decide between competing ways of breaking up the acoustic input into strings of words. Since the possible strings may be acoustically similar, a language model is required; given a word string, the model returns its linguistic probability. Several Markov language models are discussed. A novel kind of language model that reflects short-term patterns of word use by means of a cache component (analogous to cache memory in hardware terminology) is presented. The model also contains a 3g-gram component of the traditional type. The combined model and a pure 3g-gram model were tested on samples drawn from the Lancaster-Oslo/Bergen (LOB) corpus of English text. The relative performance of the two models is examined, and suggestions for future improvements are made.
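
The cache idea summarized above can be illustrated with a small sketch. This is not the authors' formulation: the class name `CachedWordModel`, the fixed word distribution standing in for the long-term component, the cache size, and the interpolation weight are all assumptions chosen for illustration. The sketch only shows how a window of recently seen words can raise the probability of words that have just been used, by linearly mixing a short-term estimate with a static one.

```python
from collections import Counter, deque


class CachedWordModel:
    """Toy mix of a fixed word distribution with a recency cache.

    Hypothetical illustration only; the static distribution stands in
    for a long-term component and is not the paper's 3g-gram model.
    """

    def __init__(self, static_probs, cache_size=200, cache_weight=0.2):
        self.static_probs = static_probs        # assumed long-term estimates
        self.cache = deque(maxlen=cache_size)   # holds only the most recent words
        self.counts = Counter()                 # word counts inside the cache
        self.cache_weight = cache_weight        # assumed interpolation weight

    def observe(self, word):
        """Record a recognized word, evicting the oldest one if the cache is full."""
        if len(self.cache) == self.cache.maxlen:
            self.counts[self.cache[0]] -= 1
        self.cache.append(word)
        self.counts[word] += 1

    def prob(self, word):
        """Return the interpolated probability of `word`."""
        static_p = self.static_probs.get(word, 0.0)
        if not self.cache:
            # With no history yet, fall back to the static estimate alone.
            return static_p
        cache_p = self.counts[word] / len(self.cache)
        return (1 - self.cache_weight) * static_p + self.cache_weight * cache_p
```

For example, with a static probability of 0.1 for some word and a cache weight of 0.5, observing that word once as the only cached item raises its estimate to 0.5 * 0.1 + 0.5 * 1.0 = 0.55; once it falls out of the cache, the estimate drops to 0.05. Because both components are proper distributions over the same vocabulary, the mixture still sums to one.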

Index Terms:
cache-based natural language model; speech recognition; word string; linguistic probability; Markov language models; Lancaster-Oslo/Bergen; English text; Markov processes; natural languages; probability
R. Kuhn, R. De Mori, "A Cache-Based Natural Language Model for Speech Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 6, pp. 570-583, June 1990, doi:10.1109/34.56193
