Speaker Adaptation in a Large-Vocabulary Gaussian HMM Recognizer
September 1990 (vol. 12 no. 9)
pp. 917-920

This paper considers how to use a small amount of speech data to adapt a set of Gaussian hidden Markov models (HMMs), trained on one speaker, so that they recognize the speech of another. The authors experimented with two techniques: a phoneme-dependent spectral mapping for adapting the mean vectors of the multivariate Gaussian distributions (analogous to the confusion-matrix method used to adapt discrete HMMs), and a heuristic for estimating covariance matrices from small amounts of data. The best results were obtained by training the mean vectors individually from the adaptation data and using the heuristic to estimate a distinct covariance matrix for each phoneme.
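As a rough illustration of the best-performing configuration, the sketch below re-estimates each phoneme's mean vector directly from labeled adaptation frames and then smooths a per-phoneme covariance estimate. The abstract does not describe the authors' covariance heuristic, so the count-weighted interpolation toward the reference speaker's covariance used here is an assumption made for illustration only. The function names, the `(phoneme, vector)` frame format, and the `tau` parameter are likewise hypothetical.

```python
def adapt_means(frames, reference_means):
    """Re-estimate each phoneme's mean from labeled adaptation frames.

    frames: list of (phoneme_label, feature_vector) pairs (hypothetical format).
    reference_means: dict mapping phoneme -> mean vector from the reference speaker.
    Phonemes with no adaptation frames keep the reference-speaker mean.
    """
    sums, counts = {}, {}
    for phoneme, vec in frames:
        if phoneme not in sums:
            sums[phoneme] = [0.0] * len(vec)
            counts[phoneme] = 0
        sums[phoneme] = [s + x for s, x in zip(sums[phoneme], vec)]
        counts[phoneme] += 1

    adapted = {}
    for phoneme, ref_mean in reference_means.items():
        n = counts.get(phoneme, 0)
        if n > 0:
            adapted[phoneme] = [s / n for s in sums[phoneme]]
        else:
            adapted[phoneme] = list(ref_mean)
    return adapted


def smooth_covariance(vectors, mean, reference_cov, tau=10.0):
    """Blend the sample covariance with the reference-speaker covariance.

    ASSUMPTION: this count-weighted interpolation is a stand-in, not the
    heuristic from the paper. With n frames, the sample estimate gets
    weight n / (n + tau), so sparse data leans on the reference covariance.
    """
    n = len(vectors)
    dim = len(mean)
    weight = n / (n + tau)
    cov = [[0.0] * dim for _ in range(dim)]
    for i in range(dim):
        for j in range(dim):
            if n > 0:
                sample = sum(
                    (v[i] - mean[i]) * (v[j] - mean[j]) for v in vectors
                ) / n
            else:
                sample = 0.0
            cov[i][j] = weight * sample + (1.0 - weight) * reference_cov[i][j]
    return cov


# Toy usage: two adaptation frames for phoneme "a", none for "b".
frames = [("a", [1.0, 2.0]), ("a", [3.0, 4.0])]
reference_means = {"a": [0.0, 0.0], "b": [5.0, 6.0]}
means = adapt_means(frames, reference_means)
cov_a = smooth_covariance(
    [v for p, v in frames if p == "a"],
    means["a"],
    reference_cov=[[1.0, 0.0], [0.0, 1.0]],
    tau=2.0,
)
```

In this toy setup, phoneme "b" has no adaptation data and so keeps its reference mean. The covariance for "a" is an even blend of the two sources, because the frame count equals `tau`.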


Index Terms:
Gaussian HMM speech recognition; speaker adaptation; hidden Markov models; phoneme-dependent spectral mapping; heuristic; covariance matrices; Markov processes; matrix algebra; spectral analysis; speech recognition
P. Kenny, M. Lennig, P. Mermelstein, "Speaker Adaptation in a Large-Vocabulary Gaussian HMM Recognizer," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 9, pp. 917-920, Sept. 1990, doi:10.1109/34.57686