2012 15th International Conference on Network-Based Information Systems (2012)
Melbourne, Australia
Sept. 26, 2012 to Sept. 28, 2012
ISBN: 978-1-4673-2331-4
pp: 32-37
The profusion of unstructured data has forced organizations to manage and take advantage of such data, especially in the decision-making process. Integrating or mapping unstructured data to a data warehouse is becoming increasingly important to bridge this gap and realize the full potential of these data. In this paper, we propose a multi-layer schema for mapping between structured data stored in a data warehouse and unstructured data in business-related documents. The multi-layer schema facilitates the mapping between the two kinds of data. Linguistically correlated data are identified using WordNet to enable integration of the two data sources. We also propose a generic XML schema for business-related unstructured documents to assist the mapping. The use of WordNet to identify matches yields promising results in the absence of schema instances and without the need for domain-specific knowledge.
Data warehouses, XML, Semantics, Organizations, Data mining, Data models, schema mapping, unstructured document, data warehouse, data integration, XML schema matching
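
The idea of matching linguistically correlated names, as described in the abstract, can be illustrated with a minimal sketch. This is not the paper's implementation: the synonym groups below are a small hand-written stand-in for WordNet lookups, and the names `SYNONYM_GROUPS`, `are_related`, and `match_schemas`, along with the sample column and element names, are hypothetical and chosen only for illustration.

```python
# Hypothetical stand-in for WordNet: each set groups terms treated as synonyms.
# A real system would query WordNet itself; these groups are assumptions.
SYNONYM_GROUPS = [
    {"customer", "client", "buyer"},
    {"price", "cost", "amount"},
    {"product", "item", "goods"},
]


def are_related(term_a, term_b):
    """Return True if the two names are equal or share a synonym group."""
    a, b = term_a.lower(), term_b.lower()
    if a == b:
        return True
    return any(a in group and b in group for group in SYNONYM_GROUPS)


def match_schemas(warehouse_columns, xml_elements):
    """Pair each warehouse column with every linguistically related XML element."""
    return [
        (column, element)
        for column in warehouse_columns
        for element in xml_elements
        if are_related(column, element)
    ]


# Assumed example names: warehouse columns on the left, XML element names on the right.
matches = match_schemas(["Customer", "Price", "Region"], ["client", "cost", "date"])
print(matches)
```

In this sketch the matching relies only on the names themselves, which mirrors the abstract's point that no instance data or domain-specific knowledge is required; the quality of the result then depends entirely on the lexical resource behind `are_related`.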

