Issue No. 02 - March/April (2008 vol. 23)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MIS.2008.23
Harksoo Kim, Kangwon National University
Jungyun Seo, Sogang University
To resolve lexical disagreement problems in FAQ retrieval, we propose a high-performance FAQ retrieval system using query-log clustering. The FAQ retrieval system is divided into two subsystems: a query-log clustering subsystem and a cluster-based retrieval subsystem. During indexing, the query-log clustering subsystem classifies the logs of users' queries into predefined FAQ categories using a dimensionality reduction technique called latent semantic analysis. It then groups the query logs according to the classification results. During retrieval, the cluster-based retrieval subsystem smooths the FAQs using the query-log clusters and then calculates the similarities between the users' queries and the smoothed FAQs. Using this cluster-based retrieval technique, the proposed system can partially bridge lexical chasms between users' queries and FAQs. In addition, the proposed system outperforms traditional information retrieval systems in FAQ retrieval.
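The two-stage pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the raw term counts, the truncated SVD used for latent semantic analysis, the linear-interpolation smoothing with weight `lam`, and all function names (`cluster_logs`, `smooth_faqs`, `retrieve`) are assumptions made here for illustration, with each FAQ treated as its own category.

```python
import numpy as np

def build_vocab(texts):
    # Map every distinct lowercase token to a column index.
    vocab = sorted({w for t in texts for w in t.lower().split()})
    return {w: i for i, w in enumerate(vocab)}

def vectorize(text, vocab):
    # Raw term-count vector; out-of-vocabulary tokens are ignored.
    v = np.zeros(len(vocab))
    for w in text.lower().split():
        if w in vocab:
            v[vocab[w]] += 1.0
    return v

def cosine(a, b):
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    return 0.0 if na == 0 or nb == 0 else float(a @ b / (na * nb))

def cluster_logs(faqs, logs, k=2):
    # Indexing stage (assumed form): project FAQs and query logs into a
    # k-dimensional latent space via truncated SVD, then assign each log
    # to the FAQ category it is closest to in that space.
    vocab = build_vocab(faqs + logs)
    F = np.array([vectorize(t, vocab) for t in faqs])
    L = np.array([vectorize(t, vocab) for t in logs])
    _, s, Vt = np.linalg.svd(np.vstack([F, L]), full_matrices=False)
    proj = Vt[:min(k, len(s))].T          # terms x k projection
    Fk, Lk = F @ proj, L @ proj
    labels = [int(np.argmax([cosine(l, f) for f in Fk])) for l in Lk]
    return vocab, F, L, labels

def smooth_faqs(F, L, labels, lam=0.5):
    # Retrieval stage, step 1 (assumed form): interpolate each FAQ vector
    # with the mean vector of the query logs clustered under it.
    S = F.copy()
    for i in range(len(F)):
        members = [L[j] for j, c in enumerate(labels) if c == i]
        if members:
            S[i] = lam * F[i] + (1 - lam) * np.mean(members, axis=0)
    return S

def retrieve(query, vocab, S):
    # Retrieval stage, step 2: rank smoothed FAQs by cosine similarity.
    q = vectorize(query, vocab)
    return int(np.argmax([cosine(q, s) for s in S]))

# Hypothetical toy data, invented for this sketch.
faqs = ["reset password account", "refund order payment"]
logs = ["forgot password login", "login reset credentials",
        "money back order", "payment money return"]
vocab, F, L, labels = cluster_logs(faqs, logs)
S = smooth_faqs(F, L, labels)
best = retrieve("forgot login", vocab, S)
```

In this toy data the query "forgot login" shares no term with either FAQ, so plain term matching scores both at zero. After smoothing, the first FAQ inherits "forgot" and "login" from its query-log cluster, which is the kind of lexical chasm the abstract says the cluster-based technique partially bridges.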
lexical disagreement problem, latent semantic analysis, query log clustering, FAQ smoothing, cluster-based FAQ retrieval
H. Kim and J. Seo, "Cluster-Based FAQ Retrieval Using Latent Term Weights," in IEEE Intelligent Systems, vol. 23, no. 2, pp. 58-65, 2008.