The Piazza Peer Data Management System
July 2004 (vol. 16 no. 7)
pp. 787-798

Abstract—Intuitively, data management and data integration tools should be well-suited for exchanging information in a semantically meaningful way. Unfortunately, they suffer from two significant problems: They typically require a comprehensive schema design before they can be used to store or share information and they are difficult to extend because schema evolution is heavyweight and may break backward compatibility. As a result, many small-scale data sharing tasks are more easily facilitated by non-database-oriented tools that have little support for semantics. The goal of the peer data management system (PDMS) is to address this need: We propose the use of a decentralized, easily extensible data management architecture in which any user can contribute new data, schema information, or even mappings between other peers' schemas. PDMSs represent a natural step beyond data integration systems, replacing their single logical schema with an interlinked collection of semantic mappings between peers' individual schemas. This paper describes several aspects of the Piazza PDMS, including the schema mediation formalism, query answering and optimization algorithms, and the relevance of PDMSs to the Semantic Web.
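The central idea in the abstract, replacing a single global schema with pairwise mappings that can be chained between peers, can be illustrated with a small sketch. This is not Piazza's mediation formalism or its query answering algorithm; it assumes each mapping is a plain attribute renaming, and the peer names (`UW`, `Penn`, `MIT`), the `MAPPINGS` table, and the `reformulate` function are all hypothetical, introduced only for illustration.

```python
from collections import deque

# Hypothetical pairwise mappings: (source peer, target peer) -> attribute renaming.
# There is no global schema; each mapping relates only two peers' schemas.
MAPPINGS = {
    ("UW", "Penn"): {"title": "name", "author": "writer"},
    ("Penn", "MIT"): {"name": "label", "writer": "creator"},
}


def reformulate(query_attrs, start_peer):
    """Translate attributes requested at start_peer into the vocabulary of
    every peer reachable through a chain of mappings (breadth-first)."""
    results = {start_peer: list(query_attrs)}
    queue = deque([start_peer])
    while queue:
        peer = queue.popleft()
        for (src, dst), renaming in MAPPINGS.items():
            if src != peer or dst in results:
                continue
            # Keep only the attributes this mapping knows how to translate.
            translated = [renaming[a] for a in results[peer] if a in renaming]
            if translated:
                results[dst] = translated
                queue.append(dst)
    return results


rewritten = reformulate(["title", "author"], "UW")
for peer, attrs in rewritten.items():
    print(peer, attrs)
```

In this sketch, a query posed at `UW` reaches `MIT` even though no direct mapping between the two exists. A new peer could join by adding one mapping to any existing peer, which is the kind of extensibility the abstract argues for.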

Index Terms:
Peer data management, data integration, schema mediation, Web, databases.
Citation:
Alon Y. Halevy, Zachary G. Ives, Jayant Madhavan, Peter Mork, Dan Suciu, Igor Tatarinov, "The Piazza Peer Data Management System," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 7, pp. 787-798, July 2004, doi:10.1109/TKDE.2004.1318562