Long Beach, CA, USA
Mar. 1, 2010 to Mar. 6, 2010
Barna Saha, Computer Science Department, University of Maryland, College Park, AV Williams Building, MD 20742, USA
Ioana Stanoi, IBM Almaden Research Center, San Jose, CA, USA
Kenneth L. Clarkson, IBM Almaden Research Center, San Jose, CA, USA
We introduce schema covering, the problem of identifying easily understandable common objects for describing large and complex schemas. Defining transformations between schemas is a key objective in information integration. However, this process often becomes cumbersome when the schemas are large and structurally complex. If such complex schemas can be broken into smaller and simpler objects, then simple transformations defined over these smaller objects can be reused to define suitable transformations among the complex schemas. Schema covering performs this vital task by identifying a collection of common concepts from a repository and creating a cover of the complex schema by these concepts. In this paper, we formulate the problem of schema covering, show that it is NP-Complete, and give efficient approximation algorithms for it. A performance evaluation with real business schemas confirms the effectiveness of our approach.
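To make the idea of covering a schema with repository concepts concrete, here is a minimal sketch that treats it as a plain set-cover instance and applies the textbook greedy heuristic. This is an illustrative assumption, not the formulation or the approximation algorithms of the paper: the function name `greedy_cover`, the attribute names, and the concept repository are all hypothetical, and the paper's actual objective may weigh concepts and overlaps differently.

```python
from typing import Dict, List, Set, Tuple


def greedy_cover(
    schema: Set[str], concepts: Dict[str, Set[str]]
) -> Tuple[List[str], Set[str]]:
    """Greedy set cover: repeatedly pick the concept that covers the most
    still-uncovered schema elements (hypothetical simplification)."""
    uncovered = set(schema)
    chosen: List[str] = []
    while uncovered:
        # Concept with the largest overlap with what is still uncovered;
        # ties go to the concept listed first in the repository.
        best = max(concepts, key=lambda name: len(concepts[name] & uncovered))
        gain = concepts[best] & uncovered
        if not gain:
            break  # no remaining concept covers anything new
        chosen.append(best)
        uncovered -= gain
    return chosen, uncovered


# Hypothetical schema elements and concept repository.
schema = {"order_id", "customer_name", "street", "city", "item", "price"}
concepts = {
    "Address": {"street", "city", "zip"},
    "Customer": {"customer_name", "street", "city"},
    "LineItem": {"item", "price"},
    "Order": {"order_id"},
}

cover, leftover = greedy_cover(schema, concepts)
print(cover, leftover)
```

In this toy run, each chosen concept would be one reusable unit for which a transformation is defined once, which is the kind of reuse the abstract describes.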
Barna Saha, Ioana Stanoi, Kenneth L. Clarkson, "Schema covering: a step towards enabling reuse in information integration", 2010 IEEE 26th International Conference on Data Engineering (ICDE), 2010, pp. 285-296, doi:10.1109/ICDE.2010.5447853