Julian Odell, Kunal Mukerjee, "Architecture, User Interface, and Enabling Technology in Windows Vista's Speech Systems," IEEE Transactions on Computers, vol. 56, no. 9, pp. 1156-1168, September 2007.
@article{10.1109/TC.2007.1065,
  author = {Julian Odell and Kunal Mukerjee},
  title = {Architecture, User Interface, and Enabling Technology in Windows Vista's Speech Systems},
  journal = {IEEE Transactions on Computers},
  volume = {56},
  number = {9},
  issn = {0018-9340},
  year = {2007},
  pages = {1156-1168},
  doi = {http://doi.ieeecomputersociety.org/10.1109/TC.2007.1065},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
TY  - JOUR
JO  - IEEE Transactions on Computers
TI  - Architecture, User Interface, and Enabling Technology in Windows Vista's Speech Systems
IS  - 9
SN  - 0018-9340
SP  - 1156
EP  - 1168
EPD - 1156-1168
A1  - Julian Odell
A1  - Kunal Mukerjee
PY  - 2007
KW  - Adaptation
KW  - Operating Systems
KW  - Speech recognition and synthesis
KW  - User interfaces
VL  - 56
JA  - IEEE Transactions on Computers
ER  -
[1] S. Burger, Z.A. Sloane, and J. Yang, “Competitive Evaluation of Commercially Available Speech Recognizers in Multiple Languages,” Proc. Language Resource and Evaluation Conf. (LREC '06), 2006.
[2] Discontinued Products Information in the Comp. Speech FAQ, http://www.speech.cs.cmu.edu/comp.speech/FAQ6.html, 2007.
[3] Detailed Product Information for Both Products, http://www.nuance.com/products, 2007.
[4] X. Huang, A. Acero, F. Alleva, M.Y. Hwang, L. Jiang, and M. Mahajan, “Microsoft Windows Highly Intelligent Speech Recognizer: Whisper,” Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing (ICASSP '95), May 1995.
[5] L. Deng and X. Huang, “Challenges in Adopting Speech Recognition,” Comm. ACM, vol. 47, no. 1, pp. 69-75, Jan. 2004.
[6] X. Huang, A. Acero, and H. Hon, Spoken Language Processing. Prentice Hall, 2001.
[7] X. Huang, F. Alleva, H.-W. Hon, M.-Y. Hwang, and R. Rosenfeld, “The SPHINX-II Speech Recognition System: An Overview,” Technical Report CMU-CS-92-112, Carnegie Mellon Univ., Jan. 1992.
[8] M. Rozak, “Talk to Your Computer and Have It Answer Back with the Microsoft Speech API,” Microsoft Systems J., Jan. 1996.
[9] X. Huang, A. Acero, F. Alleva, M. Hwang, L. Jiang, and M. Mahajan, “From Sphinx-II to Whisper: Making Speech Recognition Usable,” Automatic Speech and Speaker Recognition, Advanced Topics, C. Lee, F. Soong, and K. Paliwal, eds., Kluwer Academic, 1996.
[10] SAPI Information from the Microsoft Speech Site, http://www.microsoft.com/speech/download/oldsapi5.asp, 2007.
[11] S. Young, D. Kershaw, J. Odell, D. Ollason, V. Valtchev, and P. Woodland, “The HTK Book Version 2.2,” Entropic Cambridge Research Laboratory, Dec. 1999.
[12] P.C. Woodland, J.J. Odell, V. Valtchev, and S.J. Young, “Large Vocabulary Continuous Speech Recognition Using HTK,” Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing (ICASSP '94), vol. 2, pp. 125-128, Apr. 1994.
[13] P.C. Woodland, T. Hain, S.E. Johnson, T.R. Niesler, A. Tuerk, E.W.D. Whittaker, and S.J. Young, “The 1997 HTK Broadcast News Transcription System,” Proc. DARPA Broadcast News Transcription and Understanding Workshop, pp. 41-48, 1998.
[14] J.J. Odell, P.C. Woodland, and T. Hain, “The CUHTK-Entropic 10xRT Broadcast News Transcription System,” Proc. DARPA Broadcast News Workshop, pp. 271-275, 1999.
[15] T. Hain, P.C. Woodland, T.R. Niesler, and E.W.D. Whittaker, “The 1998 HTK System for Transcription of Conversational Telephone Speech,” Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing (ICASSP '99), pp. 57-60, 1999.
[16] P.C. Woodland, T. Hain, G. Evermann, and D. Povey, “CU-HTK March 2001 Hub5 System,” Proc. Large Vocabulary Continuous Speech Recognition Hub5 Workshop, May 2001.
[17] P.C. Woodland, H.Y. Chan, G. Evermann, M.J.F. Gales, D.Y. Kim, X.A. Liu, D. Mrva, K.C. Sim, L. Wang, K. Yu, J. Makhoul, R. Schwartz, L. Nguyen, S. Matsoukas, B. Xiang, M. Afify, S. Abdou, J.-L. Gauvain, L. Lamel, H. Schwenk, G. Adda, F. Lefevre, D. Vergyri, W. Wang, J. Zheng, A. Venkataraman, R.R. Gadde, and A. Stolcke, “SuperEARS: Multi-Site Broadcast News System,” Proc. Fall Rich Transcription Workshop (RT '04), Nov. 2004.
[18] M.J.F. Gales, D.Y. Kim, P.C. Woodland, H.Y. Chan, D. Mrva, R. Sinha, and S.E. Tranter, “Progress in the CU-HTK Broadcast News Transcription System,” IEEE Trans. Audio, Speech and Language Processing, vol. 14, no. 5, pp. 1513-1525, Sept. 2006.
[19] J.V. West, Tablet PC Quick Reference. Microsoft Press, 2002.
[20] D. Klementiev, “Software Driving Software: Active Accessibility-Compliant Apps Give Programmers New Tools to Manipulate Software,” MSDN Magazine, Apr. 2000.
[21] Developing International Software, second ed., chapter 23. Microsoft Press, 2003.
[22] D. Yu, L. Deng, X. He, and A. Acero, “Use of Incrementally Regulated Discriminative Margins in MCE Training for Speech Recognition,” Proc. Interspeech Conf., Sept. 2006.
[23] R.P. Lippmann, “Speech Recognition by Machines and Humans,” Speech Comm., vol. 22, pp. 1-15, 1997.
[24] S.B. Davis and P. Mermelstein, “Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences,” IEEE Trans. Acoustics, Speech, and Signal Processing, vol. 28, no. 4, pp. 357-366, 1980.
[25] Documentation for .Net Framework, http://msdn.microsoft.com/en-us/netframework/default.aspx, Aug. 2006.
[26] “Speech Recognition Grammar Specification (SRGS) v1.0” and “Speech Synthesis Markup Language (SSML) v1.0,” World Wide Web Consortium (W3C) recommendation, 2004.
[27] MSDN, “What Is the Indexing Service,” MSDN Library Platform SDK, Aug. 2006.
[28] R.C. Dorf, Modern Control Systems. Addison-Wesley, 1992.
[29] D. Yu, M. Mahajan, P. Mau, and A. Acero, “Maximum Entropy Based Generic Filter for Language Model Adaptation,” Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing (ICASSP '05), Mar. 2005.
[30] C.J. Leggetter, “Improved Acoustic Modeling for HMMs Using Linear Transformations,” PhD dissertation, Dept. of Eng., Univ. of Cambridge, Feb. 1995.
[31] J.L. Gauvain and C.H. Lee, “Maximum A Posteriori Estimation for Multivariate Gaussian Mixture Observations of Markov Chains,” IEEE Trans. Speech and Audio Processing, vol. 2, pp. 291-298, 1994.

