2009 WRI World Congress on Computer Science and Information Engineering
Language Model Based on Word Order Sensitive Matrix Representation in Latent Semantic Analysis for Speech Recognition
Los Angeles, California USA
March 31 - April 2, 2009
ISBN: 978-0-7695-3507-4
This paper investigates matrix representation within the latent semantic analysis (LSA) framework for language modeling. In LSA, a word-document matrix is usually used to represent a corpus; however, this matrix ignores word order within sentences. We propose several word co-occurrence matrices that preserve word order for use in LSA. To exploit these matrices, we define a context dependent class (CDC) language model, which distinguishes classes according to their context in the sentence. Experiments on the Wall Street Journal (WSJ) corpus show that the proposed method achieves better performance than the original LSA with a word-document matrix.
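The contrast between the two representations can be illustrated with a small sketch. The following is a minimal illustration, not the paper's exact construction: the toy corpus, the use of plain bigram counts as the order-sensitive co-occurrence matrix, and the NumPy-based SVD step are assumptions introduced here; the CDC language model itself is not reproduced.

```python
# Minimal sketch: a standard LSA word-document matrix versus an order-sensitive
# word co-occurrence matrix. The corpus and the bigram-count construction are
# illustrative assumptions, not the paper's exact matrices.
import numpy as np

docs = [["the", "cat", "sat"], ["the", "dog", "sat"]]
vocab = sorted({w for d in docs for w in d})
idx = {w: i for i, w in enumerate(vocab)}
V, D = len(vocab), len(docs)

# Standard LSA: word-document count matrix (word order is lost).
wd = np.zeros((V, D))
for j, doc in enumerate(docs):
    for w in doc:
        wd[idx[w], j] += 1

# Order-sensitive alternative: rows are predecessor words, columns are successor
# words, so co[i, j] counts "word i appears immediately before word j".
# Swapping the order of a word pair changes the matrix, unlike the matrix above.
co = np.zeros((V, V))
for doc in docs:
    for prev, nxt in zip(doc, doc[1:]):
        co[idx[prev], idx[nxt]] += 1

# LSA step: a truncated SVD yields low-rank latent representations for either matrix.
k = 2
U, s, Vt = np.linalg.svd(co, full_matrices=False)
word_vectors = U[:, :k] * s[:k]   # latent vector of each word in its predecessor role
print(word_vectors.shape)          # (V, k)
```

In this sketch the latent word vectors derived from the co-occurrence matrix reflect which words tend to follow a given word, which is the kind of word-order information the plain word-document matrix cannot capture.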
Index Terms:
Language model, Latent semantic analysis, Word co-occurrence matrix
Citation:
Welly Naptali, Masatoshi Tsuchiya, Seiichi Nakagawa, "Language Model Based on Word Order Sensitive Matrix Representation in Latent Semantic Analysis for Speech Recognition," in Proc. 2009 WRI World Congress on Computer Science and Information Engineering (CSIE), vol. 7, pp. 252-256, 2009.