Web Intelligence and Intelligent Agent Technology, IEEE/WIC/ACM International Conference on (2011)
Lyon, France
Aug. 22, 2011 to Aug. 27, 2011
ISBN: 978-0-7695-4513-4
pp: 40-46
We hypothesize that if one page links to another, the two pages are likely to be similar in comprehensibility. To test this hypothesis, we conduct experiments using existing readability measures, examining the relationship between links and the readability of text extracted from web pages in two datasets: one of English pages and one of Japanese pages. We find that links and the readability of the extracted text are correlated. Based on this hypothesis, we propose a link analysis algorithm for measuring the comprehensibility of web pages. Our method is based on the TrustRank algorithm, which was originally used for combating web spam. We use the link structure to propagate readability scores from source pages selected on the basis of their comprehensibility. Experimental results demonstrate that our method can improve the estimation of page comprehensibility.
comprehensibility, link analysis, readability
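
The propagation step described in the abstract can be illustrated with a minimal sketch of a TrustRank-style biased PageRank. This is not the authors' implementation: the abstract does not specify how seed pages are chosen, how readability is scored, or which parameters are used. The function name `propagate_scores`, the damping factor of 0.85, the iteration count, and the toy graph in the usage note are all assumptions made for illustration.

```python
from typing import Dict, List


def propagate_scores(
    links: Dict[str, List[str]],
    seeds: Dict[str, float],
    damping: float = 0.85,   # assumed value, not taken from the paper
    iterations: int = 50,    # assumed value, not taken from the paper
) -> Dict[str, float]:
    """Propagate seed scores along links (TrustRank-style biased PageRank).

    links: maps each page to the pages it links to.
    seeds: maps seed pages to non-negative scores, for example scores
           derived from some readability measure (hypothetical input).
    """
    # Collect every page mentioned as a source, a target, or a seed.
    pages = set(links) | set(seeds)
    for targets in links.values():
        pages.update(targets)

    total = sum(seeds.values())
    if total <= 0:
        raise ValueError("seed scores must sum to a positive value")

    # Normalised seed distribution; non-seed pages get zero.
    seed_dist = {p: seeds.get(p, 0.0) / total for p in pages}
    scores = dict(seed_dist)

    for _ in range(iterations):
        # Each page keeps a (1 - damping) share of its seed weight ...
        nxt = {p: (1.0 - damping) * seed_dist[p] for p in pages}
        # ... and receives an equal split of each in-linking page's score.
        for src in pages:
            targets = links.get(src, [])
            if not targets:
                continue  # pages without out-links pass nothing on
            share = damping * scores[src] / len(targets)
            for dst in targets:
                nxt[dst] += share
        scores = nxt
    return scores
```

In a toy graph such as `{"a": ["b"], "b": ["c"], "c": [], "x": ["y"], "y": []}` with `"a"` as the only seed, the score decays along the chain from `a` to `c`, while `x` and `y`, which are unreachable from the seed, receive nothing. This mirrors the stated hypothesis that linked pages tend to have similar comprehensibility.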

