DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MIS.2008.23
To resolve lexical disagreement problems in FAQ retrieval, we propose a high-performance FAQ retrieval system using query-log clustering. The FAQ retrieval system is divided into two subsystems: a query-log clustering subsystem and a cluster-based retrieval subsystem. During indexing, the query-log clustering subsystem classifies the logs of users' queries into predefined FAQ categories using a dimensionality reduction technique called latent semantic analysis. It then groups the query logs according to the classification results. During retrieval, the cluster-based retrieval subsystem smooths the FAQs using the query-log clusters and then calculates the similarities between the users' queries and the smoothed FAQs. Using this cluster-based retrieval technique, the proposed system can partially bridge lexical chasms between users' queries and FAQs. In addition, the proposed system outperforms traditional information retrieval systems in FAQ retrieval.
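The retrieval step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the query logs have already been grouped by FAQ category (the latent semantic analysis classification step is skipped and the clusters are assigned by hand), and it uses simple linear interpolation of term frequencies as a stand-in smoothing method. All names (`smooth_faq`, `retrieve`, the `lam` weight) and the sample data are hypothetical.

```python
import math
from collections import Counter


def term_freqs(texts):
    """Relative term frequencies over a list of whitespace-tokenized texts."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def smooth_faq(faq_text, cluster_queries, lam=0.5):
    """Mix FAQ term weights with those of its query-log cluster.

    Assumption: linear interpolation with weight `lam`; the paper's
    actual smoothing formula is not given in the abstract.
    """
    faq_tf = term_freqs([faq_text])
    cluster_tf = term_freqs(cluster_queries)
    vocab = set(faq_tf) | set(cluster_tf)
    return {
        w: (1 - lam) * faq_tf.get(w, 0.0) + lam * cluster_tf.get(w, 0.0)
        for w in vocab
    }


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, faqs, clusters, lam=0.5):
    """Rank FAQs by similarity between the query and each smoothed FAQ."""
    q = term_freqs([query])
    scored = [
        (cosine(q, smooth_faq(text, clusters.get(faq_id, []), lam)), faq_id)
        for faq_id, text in faqs.items()
    ]
    return sorted(scored, reverse=True)


# Hypothetical data: each FAQ has a hand-assigned cluster of past queries.
faqs = {
    "reset_password": "how do i reset my password",
    "shipping": "how long does delivery take",
}
clusters = {
    "reset_password": ["forgot login credentials", "cannot sign in to account"],
    "shipping": ["when will my order arrive", "package tracking status"],
}

ranking = retrieve("forgot login credentials", faqs, clusters)
```

In this toy data, the query "forgot login credentials" shares no words with either FAQ, so with `lam=0.0` (no smoothing) every score is zero. With smoothing, the terms borrowed from the query-log cluster give the password FAQ a nonzero score, which is the kind of lexical gap the abstract says the cluster-based technique partially bridges.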
Index Terms:
lexical disagreement problem, latent semantic analysis, query log clustering, FAQ smoothing, cluster-based FAQ retrieval
Citation:
Harksoo Kim, Jungyun Seo, "Cluster-Based FAQ Retrieval Using Latent Term Weights," IEEE Intelligent Systems, vol. 23, no. 2, pp. 58-65, Mar./Apr. 2008, doi:10.1109/MIS.2008.23