Tokyo
April 8, 2005 to April 9, 2005
ISBN: 0-7695-2414-1
pp: 104-112
Yusuke Suzuki, Department of Informatics, Kyushu University, Kasuga 816-8580, Japan
Tetsuhiro Miyahara, Faculty of Information Sciences, Hiroshima City University
Takayoshi Shoudai, Faculty of Information Sciences, Hiroshima City University
Tomoyuki Uchida, Faculty of Information Sciences, Hiroshima City University
Yasuaki Nakamura, Faculty of Information Sciences, Hiroshima City University
ABSTRACT
To realize Web information retrieval based on characteristic tree-structured patterns in semistructured Web documents, methods for discovering frequent patterns or common characteristics in such documents are becoming increasingly important. We have studied methods for discovering maximally frequent tree-structured patterns in semistructured Web documents. A tag tree pattern is an edge-labeled tree with ordered children and structured variables. An edge label of a tag tree pattern is a tag or a keyword in Web documents, or a wildcard that matches any string. Each variable, which matches any subtree, represents a field of a Web document. A tag tree pattern is therefore much more expressive than an ordinary tree-structured pattern. To represent tree-structured patterns with rich structural features, we introduce a new kind of variable, called a height-constrained variable. An
CITATION
Yusuke Suzuki, Tetsuhiro Miyahara, Takayoshi Shoudai, Tomoyuki Uchida, Yasuaki Nakamura, "Discovery of Maximally Frequent Tag Tree Patterns with Height-Constrained Variables from Semistructured Web Documents", WIRI, 2005, Proceedings. International Workshop on Challenges in Web Information Retrieval and Integration 2005, pp. 104-112, doi:10.1109/WIRI.2005.40