2011 IEEE International Conference on Multimedia and Expo
Reliable accent specific unit generation with dynamic Gaussian mixture selection for multi-accent speech recognition
Barcelona
July 11-July 15
ISBN: 978-1-61284-348-3
BibTeX:
@inproceedings{10.1109/ICME.2011.6011941,
  author = {Chao Zhang and Yi Liu and Yunqing Xia and Thomas Fang Zheng and Jesper Olsen and JiLei Tian},
  title = {Reliable accent specific unit generation with dynamic Gaussian mixture selection for multi-accent speech recognition},
  booktitle = {2011 IEEE International Conference on Multimedia and Expo},
  year = {2011},
  isbn = {978-1-61284-348-3},
  pages = {1--6},
  doi = {http://doi.ieeecomputersociety.org/10.1109/ICME.2011.6011941},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
Multiple accents are often present in Mandarin speech, as most Chinese speakers have learned Mandarin as a second language. We propose generating reliable accent-specific units together with dynamic Gaussian mixture selection for multi-accent speech recognition. Time-aligned phoneme recognition is used to generate these units and to model accent variations explicitly and accurately. The dynamic Gaussian mixture selection scheme builds a dynamic observation density for each frame during decoding, which makes efficient use of the Gaussian mixture components. This method increases coverage of the diverse accent variations in multi-accent speech and alleviates the performance degradation caused by pruned beam search, without increasing the model size. The effectiveness of the approach is evaluated on three typical Chinese accents: Chuan, Yue and Wu. Our approach significantly outperforms the traditional acoustic model reconstruction approach, reducing Syllable Error Rate (SER) by 6.30%, 4.93% and 5.53%, respectively, without degrading performance on standard speech.
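The abstract does not state the rule used to select mixture components. As a rough illustration only, the sketch below assumes a top-k rule: for each frame, the k best-scoring components of a pooled state mixture are kept and their weights renormalized. The function names, the diagonal-covariance form, and the top-k rule itself are assumptions made for this example and may differ from the paper's method.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def dynamic_observation_logprob(x, components, k):
    """Frame-level log-likelihood using only the k best-scoring components.

    components: list of (weight, mean, var) tuples pooled for one HMM state.
    Top-k selection with weight renormalization is an assumed rule,
    not one taken from the paper.
    """
    # Score every component by log(weight) + log N(x; mean, var).
    scored = [(math.log(w) + log_gauss_diag(x, m, v), w) for w, m, v in components]
    scored.sort(key=lambda t: t[0], reverse=True)
    top = scored[:k]
    weight_sum = sum(w for _, w in top)
    peak = top[0][0]
    # Log-sum-exp over the selected components, then renormalize
    # so the selected weights sum to 1.
    return peak + math.log(sum(math.exp(s - peak) for s, _ in top)) - math.log(weight_sum)
```

With k equal to the pool size this reduces to the ordinary mixture log-likelihood. A smaller k evaluates the density from fewer components per frame, which is one way a decoder could draw on a large pooled mixture without using all of it at every frame.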
Citation:
Chao Zhang, Yi Liu, Yunqing Xia, Thomas Fang Zheng, Jesper Olsen, JiLei Tian, "Reliable accent specific unit generation with dynamic Gaussian mixture selection for multi-accent speech recognition," 2011 IEEE International Conference on Multimedia and Expo (ICME), pp. 1-6, 2011.
