2016 International Conference on Cyberworlds (CW) (2016)
Sept. 28, 2016 to Sept. 30, 2016
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/CW.2016.30
HITS (Hyperlink-Induced Topic Search) is a classical link analysis algorithm used in Web structure mining (WSM). The algorithm considers the structural information of links but ignores the correlation between pages and topics. As a result, "topic drift" (a deviation between the search results and the topic) can appear in some cases. To address this problem, this paper presents an improved algorithm that takes into account both web content similarity and link analysis. Our experiment shows that the improved algorithm strengthens the correlation between search results and the topic and limits the occurrence of topic drift to some degree.
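The abstract does not spell out how content similarity enters the HITS update, so the following is only a minimal sketch of one plausible combination, not the paper's method. It assumes each link's contribution is scaled by the cosine similarity between the target page's text and the query; the names `cosine_similarity`, `weighted_hits`, and the toy pages are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between two texts using raw term counts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def normalize(scores):
    """Scale scores to unit Euclidean length; leave all-zero scores unchanged."""
    length = math.sqrt(sum(v * v for v in scores.values()))
    if length == 0.0:
        return dict(scores)
    return {p: v / length for p, v in scores.items()}


def weighted_hits(links, pages, query, iterations=50):
    """HITS in which every link is weighted by the target's relevance to the query.

    links: dict mapping a page to the list of pages it links to.
    pages: dict mapping a page to its text content.
    The weighting scheme is an assumption made for illustration.
    """
    relevance = {p: cosine_similarity(text, query) for p, text in pages.items()}
    auth = {p: 1.0 for p in pages}
    hub = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority update: hub scores of in-linking pages, scaled by relevance.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src] * relevance[dst]
        auth = normalize(new_auth)
        # Hub update: authority scores of linked pages, scaled by relevance.
        new_hub = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_hub[src] += auth[dst] * relevance[dst]
        hub = normalize(new_hub)
    return auth, hub


# Toy graph: page "a" links to an on-topic page "b" and an off-topic page "c".
pages = {
    "a": "python tutorial links",
    "b": "python tutorial for beginners",
    "c": "cheap car insurance deals",
}
links = {"a": ["b", "c"]}
auth, hub = weighted_hits(links, pages, "python tutorial")
```

In plain HITS, pages "b" and "c" would receive the same authority score because both are linked from the same hub. With the assumed relevance weighting, the off-topic page "c" contributes nothing, which illustrates how content similarity can limit topic drift.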
Algorithm design and analysis, Web pages, Symmetric matrices, Computers, Crawlers, Computational efficiency, Correlation
W. Yang, "An Improved HITS Algorithm Based on Analysis of Web Page Links and Web Content Similarity," 2016 International Conference on Cyberworlds (CW), Chongqing, China, 2016, pp. 147-150.