Acoustics, Speech, and Signal Processing, IEEE International Conference on (1995)
Detroit, MI, USA
May 9, 1995 to May 12, 1995
R. Haeb-Umbach, Philips GmbH Forschungslab., Aachen, Germany
P. Beyerlein, Philips GmbH Forschungslab., Aachen, Germany
E. Thelen, Philips GmbH Forschungslab., Aachen, Germany
We address the problem of automatically finding an acoustic representation (i.e., a transcription) of unknown words as a sequence of subword units, given a few sample utterances of the unknown words and an inventory of speaker-independent subword units. The problem arises if a user wants to add his own vocabulary to a speaker-independent recognition system simply by speaking the words a few times. Two methods are investigated, both based on a maximum-likelihood formulation of the problem. The experimental results show that both automatic transcription methods provide a good estimate of the acoustic models of unknown words. The recognition error rates obtained with such models in a speaker-independent recognition task are clearly better than those resulting from separate whole-word models. They are comparable with the performance of transcriptions drawn from a dictionary.
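As a reading aid, a maximum-likelihood criterion of the kind the abstract mentions could be written as follows. This is only a sketch under stated assumptions: the symbols $X_n$ (the $n$-th sample utterance), $N$ (number of utterances), $\mathcal{U}$ (the subword unit inventory), and $T$ (a candidate transcription) are introduced here for illustration, and the utterances are assumed to be conditionally independent given the transcription. The abstract does not state the exact criterion, nor how the two investigated methods differ.

```latex
% Hypothetical notation, not taken from the paper:
%   X_1, ..., X_N : sample utterances of the unknown word
%   \mathcal{U}   : inventory of speaker-independent subword units
%   T             : candidate transcription, a sequence of units from \mathcal{U}
\[
  \hat{T}
  \;=\; \operatorname*{arg\,max}_{T = (u_1, \dots, u_K),\; u_k \in \mathcal{U}}
        \; \prod_{n=1}^{N} p\!\left(X_n \mid T\right)
\]
```

Under this reading, the selected transcription is the unit sequence whose concatenated subword models best explain all sample utterances jointly, rather than any single utterance alone.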
R. Haeb-Umbach, P. Beyerlein and E. Thelen, "Automatic transcription of unknown words in a speech recognition system," Acoustics, Speech, and Signal Processing, IEEE International Conference on (ICASSP), Detroit, MI, USA, 1995, pp. 840-843.