Issue No. 03 - May/June (2005 vol. 20)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MIS.2005.38
Stanislaw Osinski, Poznan University of Technology
Dawid Weiss, Poznan University of Technology
Most search engines return search results in a single-dimensional ranking of relevance to a user's query. Although this method works well for specific information needs, it often fails when users submit broad, ambiguous queries, seeking a general cross-section of topics related to the query. Search result clustering has successfully served this purpose in both commercial and scientific systems. The proposed method separates search results (document references) into meaningful groups. Unlike previous clustering techniques that use some proximity measure between documents, this method tries to discover meaningful phrases that can become cluster descriptions and only then assign documents to those phrases to form clusters. This idea is the core of the Lingo algorithm, which combines common phrase discovery and latent semantic indexing techniques. Clusters created by Lingo are compared to those created by the classic suffix-tree clustering algorithm.
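The description-first idea in the abstract (find candidate cluster labels, then assign documents to them) can be illustrated with a minimal sketch. This is not the Lingo algorithm itself: it omits the latent semantic indexing step that Lingo uses to choose among candidate labels and simply prefers longer, more frequent phrases. The function names, the stopword list, the thresholds, and the sample documents are all assumptions made for illustration.

```python
from collections import Counter

# Illustrative stopword list (an assumption, not taken from the article).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}


def tokenize(text):
    """Lowercase, split on whitespace, and keep alphanumeric tokens only."""
    return [w for w in text.lower().split() if w.isalnum()]


def contains(words, phrase):
    """Return True if the word sequence contains the phrase contiguously."""
    n = len(phrase)
    return any(tuple(words[i:i + n]) == phrase
               for i in range(len(words) - n + 1))


def candidate_phrases(docs, max_len=3, min_docs=2):
    """Map each phrase (word tuple) found in >= min_docs documents to its document count."""
    doc_freq = Counter()
    for doc in docs:
        words = tokenize(doc)
        seen = set()
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                gram = tuple(words[i:i + n])
                # Skip phrases that start or end with a stopword.
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                seen.add(gram)
        doc_freq.update(seen)  # count each phrase once per document
    return {p: c for p, c in doc_freq.items() if c >= min_docs}


def cluster(docs, max_labels=3):
    """Pick labels first, then assign documents to every label they contain."""
    phrases = candidate_phrases(docs)
    # Stand-in for Lingo's label selection: longer, then more frequent phrases.
    ranked = sorted(phrases, key=lambda p: (-len(p), -phrases[p], p))
    labels = []
    for phrase in ranked:
        # Drop a phrase already covered by a chosen, longer label.
        if any(contains(list(label), phrase) for label in labels):
            continue
        labels.append(phrase)
        if len(labels) == max_labels:
            break
    return {
        " ".join(label): [i for i, d in enumerate(docs)
                          if contains(tokenize(d), label)]
        for label in labels
    }


docs = [
    "jaguar car dealer",
    "jaguar car review",
    "jaguar big cat habitat",
    "big cat conservation",
]
clusters = cluster(docs)
print(clusters)
```

For the four sample snippets, the sketch yields two labelled groups, "big cat" and "jaguar car", each holding two document indices. Because labels are chosen before any document is assigned, every cluster has a readable description by construction, which is the property the abstract emphasizes.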
text analysis, clustering, Web search
S. Osinski and D. Weiss, "A Concept-Driven Algorithm for Clustering Search Results," in IEEE Intelligent Systems, vol. 20, no. 3, pp. 48-54, 2005.