Issue No. 04 - July/August (2002 vol. 6)
<p>Currently, the Web contains a large amount of interesting data implicitly available on pages at various sites, including digital libraries and on-line stores. Researchers regard these data-rich pages as "data containers," because they contain useful, semistructured data. Such data is not readily available through conventional Web search tools, however, as it is typically identifiable only indirectly through visual clues such as colors, fonts, bullets, and indentations. Further, the underlying flexibility of both the content and format creates structural variations and irregularities that challenge traditional data management systems. Even though data structuring standards such as XML are likely to gain in popularity, that fact does not address the existing (and still growing) volume of semistructured Web data available, for instance, on HTML pages.</p>
A. H. Laender, A. S. da Silva, P. B. Golgher, B. Ribeiro-Neto, I. M. Evangelista-Filha and K. V. Magalhães, "The Debye Environment for Web Data Management," in IEEE Internet Computing, vol. 6, no. 4, pp. 60-69, 2002.