Discovering Structural Association of Semistructured Data
May/June 2000 (vol. 12 no. 3)
pp. 353-371

Abstract: Many semistructured objects are similarly, though not identically, structured. We study the problem of discovering “typical” substructures of a collection of semistructured objects. The discovered structures can serve the following purposes: 1) a “table of contents” for gaining general information about a source, 2) a road map for browsing and querying information sources, 3) a basis for clustering documents, 4) partial schemas for providing standard database access methods, and 5) a summary of users' or customers' interests and browsing patterns. The structural features of semistructured data affect the discovery task in nontrivial ways, and traditional data mining frameworks are inapplicable. We define this discovery problem and propose a solution.

Index Terms:
Association rule, database, data mining, knowledge discovery, semistructured data, web mining.

Ke Wang, Huiqing Liu, "Discovering Structural Association of Semistructured Data," IEEE Transactions on Knowledge and Data Engineering, vol. 12, no. 3, pp. 353-371, May-June 2000, doi:10.1109/69.846290
