Issue No. 08 - August (2007 vol. 19)
With the vast amount of digitized textual materials now available on the Internet, it is almost impossible for people to absorb all pertinent information in a timely manner. To alleviate the problem, we present a novel approach for extracting hot topics from disparate sets of textual documents published in a given time period. Our technique consists of two steps. First, hot terms are extracted by mapping their distribution over time. Second, based on the extracted hot terms, key sentences are identified and then grouped into clusters that represent hot topics by using multidimensional sentence vectors. The results of our empirical tests show that this approach is more effective in identifying hot topics than existing methods.
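The two steps above can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the authors' algorithm: the hotness score (peak-slot document frequency minus the mean over all slots), the greedy cosine-similarity clustering, the threshold value, and all function names are hypothetical choices made for illustration. The paper's actual term weighting and aging-theory model are not described in this abstract and are not reproduced here.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def hot_terms(docs, num_slots, top_k=2):
    """Step 1 (assumed scoring): rank terms by how sharply they peak over time.

    docs is a list of (slot_index, text) pairs. For each term we count the
    number of documents containing it in each time slot, then score it as
    peak count minus mean count, so evenly spread terms score near zero.
    """
    per_slot = defaultdict(lambda: [0] * num_slots)
    for slot, text in docs:
        for term in set(tokenize(text)):
            per_slot[term][slot] += 1
    scores = {t: max(c) - sum(c) / num_slots for t, c in per_slot.items()}
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


def sentence_vector(sentence, terms):
    """Represent a sentence as hot-term counts, one dimension per hot term."""
    counts = Counter(tokenize(sentence))
    return [counts[t] for t in terms]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_sentences(sentences, terms, threshold=0.5):
    """Step 2 (assumed clustering): greedily group key sentences.

    A sentence is 'key' if it contains at least one hot term. It joins the
    first cluster whose seed vector is similar enough, else starts a new one.
    """
    clusters = []  # each cluster is a list of (sentence, vector); first item is the seed
    for sentence in sentences:
        vec = sentence_vector(sentence, terms)
        if not any(vec):
            continue  # no hot term present, so not a key sentence
        for cluster in clusters:
            if cosine(vec, cluster[0][1]) >= threshold:
                cluster.append((sentence, vec))
                break
        else:
            clusters.append([(sentence, vec)])
    return [[s for s, _ in cluster] for cluster in clusters]
```

In use, `hot_terms` would be called on time-stamped documents, and the sentences of those documents (from `split_sentences`) would then be passed to `cluster_sentences` together with the returned terms. Each resulting cluster stands in for one candidate hot topic.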
Aging theory, clustering, hot topic detection, term weighting, topic detection and tracking.
Kuan-Yu Chen, Seng-cho T. Chou, Luesak Luesukprasert, "Hot Topic Extraction Based on Timeline Analysis and Multidimensional Sentence Modeling", IEEE Transactions on Knowledge & Data Engineering, vol. 19, no. 8, pp. 1016-1025, August 2007, doi:10.1109/TKDE.2007.1040