Proceedings of the 41st Annual Hawaii International Conference on System Sciences (HICSS 2008)
Waikoloa, Big Island, Hawaii
Jan. 7, 2008 to Jan. 10, 2008
ISSN: 1530-1605
ISBN: 0-7695-3075-3
pp: 133
ABSTRACT
In this paper, we describe a set of experiments that examine how various attributes of web genre affect the automatic identification of the genre of web pages. The data set covers four genres: FAQ, News, E-Shopping, and Personal Home Pages. We examine the effect of the number of features used to represent the web pages (5, 20, or 100) and of the types of attributes (content, form, and functionality), used singly and in various combinations. The results indicate that fewer features produce better precision but more features produce better recall, and that attributes in combination will always perform better than single attributes.
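The precision and recall trade-off mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard per-class definitions (precision = TP / (TP + FP), recall = TP / (TP + FN)); the function name `precision_recall` and the toy labels are hypothetical and are not taken from the paper's data or method.

```python
# Hypothetical illustration: per-genre precision and recall.
# The labels below are invented for the example, not results from the paper.

GENRES = ["FAQ", "News", "E-Shopping", "Personal Home Pages"]


def precision_recall(true_labels, predicted_labels, genre):
    """Return (precision, recall) for a single genre."""
    pairs = list(zip(true_labels, predicted_labels))
    # True positives: pages of this genre that were labelled as this genre.
    tp = sum(1 for t, p in pairs if t == genre and p == genre)
    # False positives: pages of another genre that were labelled as this genre.
    fp = sum(1 for t, p in pairs if t != genre and p == genre)
    # False negatives: pages of this genre that were labelled as something else.
    fn = sum(1 for t, p in pairs if t == genre and p != genre)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy ground truth and classifier output (assumed data).
true_labels = ["FAQ", "FAQ", "News", "News", "E-Shopping", "Personal Home Pages"]
predicted_labels = ["FAQ", "News", "News", "News", "E-Shopping", "FAQ"]

for genre in GENRES:
    p, r = precision_recall(true_labels, predicted_labels, genre)
    print(f"{genre}: precision={p:.2f}, recall={r:.2f}")
```

Running the same computation on classifiers trained with 5, 20, or 100 features would be one way to compare the settings the abstract describes, though the paper's actual evaluation procedure is not given here.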
CITATION

J. Duffy, C. Watters, M. Shepherd and L. Dong, "An Examination of Genre Attributes for Web Page Classification," Proceedings of the 41st Annual Hawaii International Conference on System Sciences (HICSS 2008), Waikoloa, Big Island, Hawaii, 2008, pp. 133.
doi:10.1109/HICSS.2008.53