2010 IEEE 26th International Conference on Data Engineering (ICDE 2010)
Long Beach, CA, USA
Mar. 1, 2010 to Mar. 6, 2010
Barna Saha, Computer Science Department, University of Maryland College Park, AV Williams Building, US 20742
Ioana Stanoi, IBM Almaden Research Center, San Jose, CA, USA
Kenneth L. Clarkson, IBM Almaden Research Center, San Jose, CA, USA
We introduce schema covering, the problem of identifying easily understandable common objects for describing large and complex schemas. Defining transformations between schemas is a key objective in information integration. However, this process often becomes cumbersome when the schemas are large and structurally complex. If such complex schemas can be broken into smaller and simpler objects, then simple transformations defined over these smaller objects can be reused to define suitable transformations among the complex schemas. Schema covering performs this vital task by identifying a collection of common concepts from a repository and creating a cover of the complex schema by these concepts. In this paper, we formulate the problem of schema covering, show that it is NP-Complete, and give efficient approximation algorithms for it. A performance evaluation with real business schemas confirms the effectiveness of our approach.
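To make the covering idea concrete, here is a minimal sketch of a greedy heuristic in the style of classical set cover. It is an illustration only and is not taken from the paper: the abstract does not describe the authors' algorithms, and their formulation may weigh match quality or overlap in ways this sketch ignores. The function name `greedy_cover`, the attribute names, and the concept repository are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Set


def greedy_cover(schema: Set[str], concepts: Dict[str, FrozenSet[str]]) -> List[str]:
    """Greedily pick concepts until no concept covers any remaining element.

    Assumption (not from the paper): a concept "covers" a schema element
    when the element's name appears in the concept's attribute set.
    """
    uncovered = set(schema)
    chosen: List[str] = []
    if not concepts:
        return chosen
    while uncovered:
        # Pick the concept covering the most still-uncovered elements;
        # iterating over sorted names makes tie-breaking deterministic.
        best = max(sorted(concepts), key=lambda name: len(concepts[name] & uncovered))
        gain = concepts[best] & uncovered
        if not gain:
            break  # the remaining elements match no concept in the repository
        chosen.append(best)
        uncovered -= gain
    return chosen


# Hypothetical schema elements and concept repository.
schema = {"order_id", "customer_name", "customer_address", "item_sku", "item_price"}
concepts = {
    "Customer": frozenset({"customer_name", "customer_address"}),
    "LineItem": frozenset({"item_sku", "item_price"}),
    "Order": frozenset({"order_id", "customer_name"}),
    "Address": frozenset({"customer_address"}),
}

cover = greedy_cover(schema, concepts)
print(cover)
```

On this toy input the heuristic selects three concepts that together cover every schema element, and it never needs the "Address" concept because "Customer" already covers that attribute. For plain set cover, the greedy rule is a standard logarithmic-factor approximation; whether the same guarantee carries over to schema covering as defined in the paper is not something the abstract states.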
K. L. Clarkson, I. Stanoi and B. Saha, "Schema covering: a step towards enabling reuse in information integration," 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010), Long Beach, CA, USA, 2010, pp. 285-296.