Issue No. 05 - May (1978 vol. 27)
R.L. Kashyap, School of Electrical Engineering, Purdue University
We describe a method of recognizing isolated words and phrases from a given vocabulary spoken by any member of a given group of speakers, the identity of the speaker being unknown to the system. The word utterance is divided into 20-30 nearly equal frames, with frame boundaries aligned with glottal pulses for voiced speech. A constant number of pitch periods is included in each frame. Statistical decision rules are used to determine the phoneme in each frame. Using the string of phonemes from all the frames of the utterance, a word decision is obtained using (phonological) syntactic rules. The syntactic rules used here are of two types: 1) those obtained from the theory of word construction from phonemes in English, as applied to our vocabulary, and 2) those used to correct possible errors in earlier phonemic decisions, based on the decisions of neighboring segments. In our experiment, the vocabulary had 40 words, including many pairs of words that are phonemically close to each other, and the number of speakers was 6. In testing 400 word utterances, the recognition rate was about 80 percent for phonemes (for 11 phonemes), but word recognition was 98.1 percent correct. Phonological-syntactic rules played an important role in upgrading the word recognition rate over the phoneme recognition rate.
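The pipeline in the abstract (per-frame phoneme decisions, correction of isolated errors from neighboring segments, then a word decision from the phoneme string) can be illustrated with a minimal sketch. This is not the authors' algorithm: the majority-vote smoothing, the edit-distance word match, the phoneme symbols, the three-word `lexicon`, and all function names are assumptions chosen only to show how word accuracy can exceed frame-level phoneme accuracy.

```python
from collections import Counter
from itertools import groupby


def smooth_frames(frames, window=3):
    """Relabel a frame when a strict majority of its window disagrees with it.

    Assumption: a simple majority vote stands in for the paper's
    neighbor-based error-correction rules, which the abstract does not detail.
    """
    half = window // 2
    smoothed = []
    for i in range(len(frames)):
        neighborhood = frames[max(0, i - half): i + half + 1]
        top_label, top_count = Counter(neighborhood).most_common(1)[0]
        # Keep the original label unless one label holds a strict majority.
        smoothed.append(top_label if top_count > len(neighborhood) / 2 else frames[i])
    return smoothed


def collapse_runs(frames):
    """Merge consecutive identical frame labels into one phoneme each."""
    return [label for label, _ in groupby(frames)]


def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = curr
    return prev[-1]


def recognize(frames, lexicon):
    """Pick the lexicon word whose phoneme string is closest to the input."""
    phonemes = collapse_runs(smooth_frames(frames))
    return min(lexicon, key=lambda word: edit_distance(phonemes, lexicon[word]))


# Hypothetical lexicon and frame labels, not taken from the paper.
lexicon = {
    "bat": ["b", "ae", "t"],
    "pat": ["p", "ae", "t"],
    "bit": ["b", "ih", "t"],
}
# One of the nine frames ("ih") is a misclassification inside the vowel.
frames = ["b", "b", "ae", "ae", "ih", "ae", "ae", "t", "t"]
print(recognize(frames, lexicon))  # prints "bat"
```

In this toy case one of nine frame decisions is wrong, yet the word is still recovered, because the isolated error is outvoted by its neighbors before the string is matched against the vocabulary.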
Use of statistical information in syntactic methods, error correction of symbol strings, feature extraction and pattern recognition, multiple talker environment, spoken word recognition, syntactic approach in speech recognition
R. Kashyap and M. Mittal, "Recognition of Spoken Words and Phrases in Multitalker Environment Using Syntactic Methods," in IEEE Transactions on Computers, vol. 27, no. 5, pp. 442-452, 1978.