Greenwich, London, U.K.
Sept. 6, 2000 to Sept. 8, 2000
V. Carchiolo, Istituto di Inf. e Telecommun., Catania Univ., Italy
A. Longheu, Istituto di Inf. e Telecommun., Catania Univ., Italy
M. Malgeri, Istituto di Inf. e Telecommun., Catania Univ., Italy
The WWW is a very large and rich information source, but it has no structure, so locating data of interest can be difficult. In particular, a page may be divided into different logical sections of information, and highlighting them may improve both browsing and searching. We propose a simple Web page structuring that introduces the "semantic block" as a more granular level at which to categorize information inside a page. We also propose a set of XML tags to be added to the existing HTML tags in order to locate such blocks and to make structured pages usable with both current and future structure-aware browsers, with the goal of a gradual migration towards a more structured Web. We apply our technique to several Web sites in order to determine which semantic blocks are needed, using two simple Java-based tools we developed to add XML tags and manage the resulting structure. Finally, we consider how the schema can be represented for better browsing.
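As a rough illustration of the idea of marking semantic blocks with XML tags inside an HTML page, the sketch below wraps parts of a page in a namespaced element and then collects the blocks by type. The tag name `sb:block`, its `type` attribute, the namespace URI, and the block types are all hypothetical, chosen only for this example; the abstract does not list the tag set the authors actually propose, and their tools are Java-based rather than Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI for the semantic-block tags (an assumption,
# not taken from the paper).
SB_NS = "http://example.org/semantic-block"

# A small well-formed page: ordinary HTML tags plus hypothetical
# <sb:block> wrappers marking logical sections of the page.
PAGE = f"""<html xmlns:sb="{SB_NS}">
  <body>
    <sb:block type="navigation"><a href="/home">Home</a></sb:block>
    <sb:block type="content"><p>Main article text.</p></sb:block>
    <p>Unmarked text outside any semantic block.</p>
  </body>
</html>"""


def semantic_blocks(xhtml):
    """Return a dict mapping each block type to the text of its blocks."""
    root = ET.fromstring(xhtml)
    blocks = {}
    # ElementTree addresses namespaced tags as "{namespace-uri}localname".
    for elem in root.iter(f"{{{SB_NS}}}block"):
        text = "".join(elem.itertext()).strip()
        blocks.setdefault(elem.get("type", "unknown"), []).append(text)
    return blocks


if __name__ == "__main__":
    for block_type, texts in semantic_blocks(PAGE).items():
        print(block_type, texts)
```

Because the extra tags live in their own namespace, a structure-aware tool can pick them out while the HTML content inside them stays unchanged; how a given browser treats unknown tags is outside the scope of this sketch.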
hypermedia markup languages; WWW; Web page structuring; semantic block; XML tags; HTML tags; structure-aware browsers; Web sites; Java-based tools
V. Carchiolo, A. Longheu, M. Malgeri, "Structuring the Web", International Workshop on Database and Expert Systems Applications (DEXA), 2000, pp. 1123, doi:10.1109/DEXA.2000.875167