Issue No. 08 - Aug. (2017 vol. 29)
ISSN: 1041-4347
pp: 1751-1764
Marcin Wylot, Open Distributed Systems, TU Berlin/Fraunhofer FOKUS, Berlin, Germany
Philippe Cudre-Mauroux, eXascale Infolab, University of Fribourg, Fribourg, Switzerland
Manfred Hauswirth, Open Distributed Systems, TU Berlin/Fraunhofer FOKUS, Berlin, Germany
Paul Groth, Elsevier Labs, Amsterdam, NX, The Netherlands
ABSTRACT
The proliferation of heterogeneous Linked Data on the Web poses new challenges to database systems. In particular, the capacity to store, track, and query provenance data is becoming a pivotal feature of modern triplestores. We present methods extending a native RDF store to efficiently handle the storage, tracking, and querying of provenance in RDF data. We describe a reliable and understandable specification of the way results were derived from the data and how particular pieces of data were combined to answer a query. Subsequently, we present techniques to tailor queries with provenance data. We empirically evaluate the presented methods and show that the overhead of storing and tracking provenance is acceptable. Finally, we show that tailoring a query with provenance information can also significantly improve the performance of query execution.
INDEX TERMS
Resource description framework, W3C, Triples (Data structure), Query processing, RDF, linked data, triplestores, BigData, provenance
CITATION
Marcin Wylot, Philippe Cudre-Mauroux, Manfred Hauswirth, Paul Groth, "Storing, Tracking, and Querying Provenance in Linked Data", IEEE Transactions on Knowledge & Data Engineering, vol. 29, no. 8, pp. 1751-1764, Aug. 2017, doi:10.1109/TKDE.2017.2690299