Cluster-Based FAQ Retrieval Using Latent Term Weights
March/April 2008 (vol. 23 no. 2)
pp. 58-65
Harksoo Kim, Kangwon National University
Jungyun Seo, Sogang University
To resolve lexical disagreement problems in FAQ retrieval, we propose a high-performance FAQ retrieval system that uses query-log clustering. The system is divided into two subsystems: a query-log clustering subsystem and a cluster-based retrieval subsystem. During indexing, the query-log clustering subsystem classifies logged user queries into predefined FAQ categories using latent semantic analysis, a dimensionality reduction technique, and then groups the query logs according to the classification results. During retrieval, the cluster-based retrieval subsystem smooths the FAQs using the query-log clusters and then calculates the similarities between users' queries and the smoothed FAQs. With this cluster-based retrieval technique, the proposed system can partially bridge lexical chasms between users' queries and FAQs. In addition, the proposed system outperforms traditional information retrieval systems in FAQ retrieval.
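The retrieval step described above can be illustrated with a minimal sketch. The abstract does not give the smoothing formula or the similarity measure, so everything here is an assumption: the query-log clusters are taken as already built (the latent semantic analysis classification step is not shown), smoothing is a simple linear interpolation of term distributions with a hypothetical weight `lam`, and similarity is plain cosine. The function names and the toy data are invented for illustration and are not the authors' implementation.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would use proper text analysis.
    return text.lower().split()


def term_dist(texts):
    # Relative term frequencies over a list of texts.
    counts = Counter(tok for t in texts for tok in tokenize(t))
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def smooth_faq(faq_text, cluster_logs, lam=0.5):
    # Assumed smoothing: interpolate the FAQ's own term distribution with the
    # term distribution of the query logs clustered under that FAQ.
    faq = term_dist([faq_text])
    if not cluster_logs:
        return faq
    cluster = term_dist(cluster_logs)
    vocab = set(faq) | set(cluster)
    return {w: lam * faq.get(w, 0.0) + (1 - lam) * cluster.get(w, 0.0)
            for w in vocab}


def cosine(a, b):
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, faqs, clusters, lam=0.5):
    # Score every FAQ against the query after smoothing; return best id and all scores.
    q = term_dist([query])
    scores = {fid: cosine(q, smooth_faq(text, clusters.get(fid, []), lam))
              for fid, text in faqs.items()}
    return max(scores, key=scores.get), scores


# Toy data (hypothetical): FAQ texts and query logs already grouped per FAQ.
faqs = {
    "password": "how do i reset my password",
    "shipping": "when will my order ship",
}
clusters = {
    "password": ["forgot login credentials", "cannot sign in to account"],
    "shipping": ["package delivery time", "where is my parcel"],
}

query = "forgot credentials"
best, scores = retrieve(query, faqs, clusters)
_, unsmoothed = retrieve(query, faqs, {})  # same query, no cluster smoothing
print(best, scores, unsmoothed)
```

In this toy setup the query shares no words with either FAQ, so without smoothing every score is zero. After smoothing, the "password" FAQ inherits the vocabulary of its query-log cluster and becomes retrievable, which is the kind of lexical gap the abstract says the approach partially bridges.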

Index Terms:
lexical disagreement problem, latent semantic analysis, query log clustering, FAQ smoothing, cluster-based FAQ retrieval
Citation:
Harksoo Kim, Jungyun Seo, "Cluster-Based FAQ Retrieval Using Latent Term Weights," IEEE Intelligent Systems, vol. 23, no. 2, pp. 58-65, March-April 2008, doi:10.1109/MIS.2008.23