21st International Conference on Data Engineering (ICDE'05)
Corpus-Based Schema Matching
Tokyo, Japan
April 5-8, 2005
ISBN: 0-7695-2285-8
Citation:
Jayant Madhavan, Philip A. Bernstein, AnHai Doan, Alon Halevy, "Corpus-Based Schema Matching," Data Engineering, International Conference on, pp. 57-68, 21st International Conference on Data Engineering (ICDE'05), 2005.
BibTeX:
@article{10.1109/ICDE.2005.39,
  author = {Jayant Madhavan and Philip A. Bernstein and AnHai Doan and Alon Halevy},
  title = {Corpus-Based Schema Matching},
  journal = {Data Engineering, International Conference on},
  volume = {0},
  year = {2005},
  issn = {1084-4627},
  pages = {57-68},
  doi = {http://doi.ieeecomputersociety.org/10.1109/ICDE.2005.39},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDE.2005.39
Schema matching is the problem of identifying corresponding elements in different schemas. Discovering these correspondences, or matches, is inherently difficult to automate. Past solutions have proposed a principled combination of multiple algorithms. However, these solutions sometimes perform rather poorly because the schemas being matched do not provide sufficient evidence. In this paper we show how a corpus of schemas and mappings can be used to augment the evidence about the schemas being matched, so that they can be matched more accurately. Such a corpus typically contains multiple schemas that model similar concepts and hence enables us to learn variations in the elements and their properties. We exploit such a corpus in two ways. First, we increase the evidence about each element being matched by including evidence from similar elements in the corpus. Second, we learn statistics about elements and their relationships and use them to infer constraints that we use to prune candidate mappings. We also describe how to use known mappings to learn the importance of domain and generic constraints. We present experimental results that demonstrate that corpus-based matching outperforms direct matching (matching without the benefit of a corpus) in multiple domains.
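The first use of the corpus described in the abstract (adding evidence from similar corpus elements to each element being matched) can be illustrated with a toy sketch. This is not the paper's algorithm: the token-based Jaccard similarity, the `threshold` value, the function names, and the representation of the corpus as groups of element names already known to denote the same concept are all assumptions made purely for illustration.

```python
def tokens(name):
    """Split an element name such as 'cust_phone' into lowercase tokens."""
    return set(name.lower().replace("-", "_").split("_"))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def augment(element, corpus, threshold=0.3):
    """Return the element's tokens plus the tokens of every corpus concept
    that has at least one member name similar enough to the element.

    `corpus` is a hypothetical structure: a list of groups, each group
    holding names that earlier mappings identified as the same concept.
    """
    own = tokens(element)
    evidence = set(own)
    for group in corpus:
        best = max(jaccard(own, tokens(name)) for name in group)
        if best >= threshold:
            evidence |= set().union(*(tokens(name) for name in group))
    return evidence


def direct_score(a, b):
    """Match score using only the two element names."""
    return jaccard(tokens(a), tokens(b))


def corpus_score(a, b, corpus):
    """Match score after augmenting both elements with corpus evidence."""
    return jaccard(augment(a, corpus), augment(b, corpus))


# Illustrative corpus, invented for this example.
corpus = [
    ["phone", "telephone", "tel_no"],
    ["zip", "postal_code", "zipcode"],
]

print(direct_score("cust_phone", "telephone"))
print(corpus_score("cust_phone", "telephone", corpus))
```

In this toy setting the two names share no tokens, so the direct score is zero, while the augmented score is high because both names resolve to the same corpus concept. The second use of the corpus, learning statistics to infer pruning constraints, is not covered by this sketch.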
