International Workshop on Challenges in Web Information Retrieval and Integration
Building A Document Class Hierarchy for Obtaining More Proper Bibliographies from Web
Tokyo, Japan
April 08-April 09
ISBN: 0-7695-2414-1
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/WIRI.2005.13
To help researchers in scientific and technological fields find more appropriate information resources on the Web, we propose an auxiliary search structure: a class hierarchy of documents built from the documents' keywords. To cover the content of each document properly, the keywords are extracted by mining maximal sequential frequent phrases. This paper defines the concept of a maximal sequential frequent phrase, and designs and implements the corresponding mining algorithm. Experiments show that keyword extraction using maximal sequential frequent phrases achieves a better F-measure than extraction using traditional TFIDF weights. Moreover, compared with previous work, our extended class hierarchy tree represents hierarchical relationships among the keywords themselves as well as between keywords and documents, which allows it to support queries at different professional levels.
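The abstract does not give the paper's formal definition or algorithm, so the following is only a minimal sketch of the general idea under stated assumptions: a "phrase" is taken to be a contiguous word sequence, its support is the number of sentences containing it, and a frequent phrase is "maximal" when no longer frequent phrase contains it. The function names, the `min_support` threshold, and the `max_len` cap are illustrative choices, not taken from the paper.

```python
from collections import defaultdict


def frequent_phrases(sentences, min_support, max_len=5):
    """Count contiguous word n-grams (assumed phrase model) per sentence
    and keep those that occur in at least min_support sentences."""
    support = defaultdict(int)
    for sentence in sentences:
        words = sentence.lower().split()
        seen = set()  # count each phrase at most once per sentence
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                seen.add(tuple(words[i:i + n]))
        for phrase in seen:
            support[phrase] += 1
    return {p: s for p, s in support.items() if s >= min_support}


def contains(longer, shorter):
    """True if `shorter` occurs as a contiguous run inside `longer`."""
    n = len(shorter)
    return any(longer[i:i + n] == shorter
               for i in range(len(longer) - n + 1))


def maximal_phrases(frequent):
    """Keep only frequent phrases not contained in a longer frequent phrase."""
    phrases = list(frequent)
    return {
        p: frequent[p]
        for p in phrases
        if not any(len(q) > len(p) and contains(q, p) for q in phrases)
    }


if __name__ == "__main__":
    docs = [
        "web information retrieval is hard",
        "web information retrieval and integration",
        "class hierarchy of documents",
        "a class hierarchy tree",
    ]
    for phrase, count in maximal_phrases(frequent_phrases(docs, 2)).items():
        print(" ".join(phrase), count)
```

In this toy input, shorter frequent phrases such as "web information" are absorbed by the longer "web information retrieval", which is the sense in which maximal phrases give a compact set of candidate keywords. The pairwise containment check is quadratic in the number of frequent phrases and is meant for illustration, not for large corpora.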
Citation:
Daling Wang, Ge Yu, Minghan Hu, Yubin Bao, Meng Zhang, "Building A Document Class Hierarchy for Obtaining More Proper Bibliographies from Web," wiri, pp.214-219, International Workshop on Challenges in Web Information Retrieval and Integration, 2005