Acoustics, Speech, and Signal Processing, IEEE International Conference on (1999)
Phoenix, AZ, USA
Mar. 15, 1999 to Mar. 19, 1999
ISBN: 0-7803-5041-3
pp: 397-400
R. Haeb-Umbach, Philips Res. Lab., Aachen, Germany
We apply Fisher variate analysis to measure the effectiveness of speaker normalization techniques. A trace criterion, which measures the ratio of the variation due to different phonemes to the variation due to different speakers, serves as a first assessment of a feature set without the need for recognition experiments. Using this measure together with recognition experiments, we demonstrate that cepstral mean normalization also has a speaker normalization effect, in addition to the well-known channel normalization effect. Similarly, vocal tract normalization (VTN) is shown to remove inter-speaker variability. For VTN we show that normalization on a per-sentence basis performs better than normalization on a per-speaker basis. Recognition results are given on the Wall Street Journal and Hub-4 databases.
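To make the trace criterion concrete, the following is a minimal sketch of one possible reading of it, not the paper's exact formulation. It assumes the criterion takes the form tr(S_spk^{-1} S_phn), where S_phn is the scatter of phoneme means around the global mean and S_spk is the scatter of per-speaker phoneme means around their phoneme mean. The names `trace_criterion`, `toy_frames`, and the `ridge` parameter are hypothetical and introduced only for illustration, and the toy data is synthetic.

```python
import numpy as np


def trace_criterion(features, phonemes, speakers, ridge=1e-8):
    """Return tr(S_spk^{-1} S_phn) for frame-level feature vectors.

    Assumed definitions (not taken from the paper):
      S_phn: scatter of phoneme means around the global mean.
      S_spk: scatter of (phoneme, speaker) means around their phoneme mean.
    A larger value means phoneme variation dominates speaker variation.
    """
    x = np.asarray(features, dtype=float)
    phonemes = np.asarray(phonemes)
    speakers = np.asarray(speakers)
    dim = x.shape[1]
    global_mean = x.mean(axis=0)
    s_phn = np.zeros((dim, dim))
    s_spk = np.zeros((dim, dim))
    for p in np.unique(phonemes):
        in_p = phonemes == p
        mean_p = x[in_p].mean(axis=0)
        d = (mean_p - global_mean)[:, None]
        s_phn += in_p.sum() * (d @ d.T)
        for s in np.unique(speakers[in_p]):
            in_ps = in_p & (speakers == s)
            d = (x[in_ps].mean(axis=0) - mean_p)[:, None]
            s_spk += in_ps.sum() * (d @ d.T)
    # Small ridge term (an assumption) keeps S_spk invertible.
    s_spk += ridge * np.eye(dim)
    return float(np.trace(np.linalg.solve(s_spk, s_phn)))


def toy_frames(speaker_spread):
    """Build synthetic 2-D data: 2 phonemes x 2 speakers, one frame each."""
    phoneme_means = {"a": np.array([0.0, 0.0]), "i": np.array([4.0, 1.0])}
    # Speaker offsets differ per phoneme so S_spk has full rank.
    offsets = {
        ("a", "spk1"): np.array([1.0, 0.0]),
        ("a", "spk2"): np.array([-1.0, 0.0]),
        ("i", "spk1"): np.array([0.0, 1.0]),
        ("i", "spk2"): np.array([0.0, -1.0]),
    }
    feats, phns, spks = [], [], []
    for (p, s), off in offsets.items():
        feats.append(phoneme_means[p] + speaker_spread * off)
        phns.append(p)
        spks.append(s)
    return np.array(feats), phns, spks
```

Under these assumptions, calling `trace_criterion(*toy_frames(0.5))` gives a larger value than `trace_criterion(*toy_frames(1.0))`, since shrinking the speaker-dependent offsets is the effect a successful speaker normalization would have on the feature space.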

