Web Intelligence and Intelligent Agent Technology, IEEE/WIC/ACM International Conference on (2011)
Aug. 22, 2011 to Aug. 27, 2011
We hypothesize that if one page links to another, the two pages are likely to have similar comprehensibility. To test this hypothesis, we conduct experiments using existing readability measures, investigating the relationship between links and the readability of text extracted from web pages in two datasets, one of English pages and one of Japanese pages. We find that links and the readability of the extracted text are correlated. Based on this hypothesis, we propose a link analysis algorithm to measure the comprehensibility of web pages. Our method is based on the TrustRank algorithm, which was originally used for combating web spam. We use the link structure to propagate readability scores from source pages selected based on their comprehensibility. The results of the experimental evaluation demonstrate that our method can improve the estimation of page comprehensibility.
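The propagation step described in the abstract can be illustrated with a minimal sketch of a TrustRank-style biased PageRank. This is not the authors' implementation: the function name `propagate_readability`, the damping factor of 0.85, the fixed iteration count, and the handling of pages without outgoing links are all assumptions made for illustration, and the seed scores stand in for whatever readability measure is used to select the source pages.

```python
def propagate_readability(links, seed_scores, damping=0.85, iterations=50):
    """Propagate seed readability scores along links (TrustRank-style sketch).

    links:       dict mapping each page to the list of pages it links to
    seed_scores: dict mapping seed pages to non-negative readability scores
    Returns a dict mapping every page to its propagated score (sums to 1).
    All parameter names and defaults are illustrative assumptions.
    """
    # Collect every page that appears as a link source or a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)

    total = sum(seed_scores.values())
    if total <= 0:
        raise ValueError("seed_scores must contain a positive total score")

    # Bias vector: normalized seed scores, zero for non-seed pages.
    bias = {p: seed_scores.get(p, 0.0) / total for p in pages}
    scores = dict(bias)

    for _ in range(iterations):
        # Each page restarts from the bias vector with weight (1 - damping).
        updated = {p: (1.0 - damping) * bias[p] for p in pages}
        for src in pages:
            targets = links.get(src, [])
            if targets:
                # Split the damped score evenly among outgoing links.
                share = damping * scores[src] / len(targets)
                for dst in targets:
                    updated[dst] += share
            else:
                # Assumption: a page with no outlinks returns its score to the seeds.
                for p in pages:
                    updated[p] += damping * scores[src] * bias[p]
        scores = updated
    return scores


# Hypothetical four-page graph with a single seed page "a".
example_links = {"a": ["b"], "b": ["c"], "c": [], "d": ["a"]}
example_scores = propagate_readability(example_links, {"a": 1.0})
```

In this sketch, pages reachable from the seeds receive scores that decay with link distance, while a page that no seed can reach, such as "d" above, keeps a score of zero.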
comprehensibility, link analysis, readability
K. Akamatsu, N. Pattanasri, K. Tanaka and A. Jatowt, "Measuring Comprehensibility of Web Pages Based on Link Analysis," 2011 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (WI-IAT), Lyon, 2011, pp. 40-46.