2013 IEEE 5th International Conference on Cloud Computing Technology and Science (2013)
Bristol, United Kingdom
Dec. 2, 2013 to Dec. 5, 2013
pp: 631-638
It has been widely recognized in recent years that, due to their inherent scalability, frameworks based on MapReduce are indispensable for so-called "Big Data" applications. However, for Semantic Web applications using SPARQL, there is still a demand for sophisticated MapReduce join techniques for processing basic graph patterns (BGPs), which are at the core of SPARQL. Renowned for their stable and efficient performance, sort-merge joins have become widely used in DBMSs. In this paper, we demonstrate the adaptation of merge joins for SPARQL BGP processing with MapReduce. Our technique supports both n-way joins and sequences of join operations by applying merge joins within the map phase of MapReduce, while the reduce phase is used only to fulfill the preconditions of a subsequent join iteration. Our experiments with the LUBM benchmark show an average performance benefit between 15% and 48% compared to other MapReduce-based approaches, while at the same time scaling linearly with the RDF dataset size.
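As an illustration of the merge join that the abstract builds on, the following is a minimal single-process sketch in Python. It is not the authors' implementation and involves no MapReduce machinery; it only shows the core step, which requires both inputs to be sorted on the join key. The function name `merge_join`, the `(join key, value)` row layout, and the sample data are assumptions made for this example.

```python
from typing import Iterator, List, Tuple

# Hypothetical row layout: (join key, bound value), e.g. bindings for ?s
Row = Tuple[str, str]


def merge_join(left: List[Row], right: List[Row]) -> Iterator[Tuple[str, str, str]]:
    """Join two lists that are already sorted on their first field."""
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1  # left key has no partner yet, advance left
        elif lk > rk:
            j += 1  # right key has no partner yet, advance right
        else:
            # Find the run of equal keys on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Emit the cross product of the two runs.
            for a in left[i:i_end]:
                for b in right[j:j_end]:
                    yield (lk, a[1], b[1])
            i, j = i_end, j_end


# Assumed sample bindings for a two-pattern BGP joined on ?s:
#   ?s rdf:type Student .   ?s memberOf ?d .
types = [("s1", "Student"), ("s2", "Student"), ("s4", "Student")]
members = [("s1", "d1"), ("s1", "d2"), ("s3", "d1"), ("s4", "d3")]
result = list(merge_join(types, members))
```

Each input is read once from front to back, which is why the sort order on the join key matters: in the approach described above, establishing that order for the next join iteration is the job left to the reduce phase.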
Resource description framework, Sorting, Layout, Pattern matching, Educational institutions, Information management, MapReduce, Map-Side Merge Join, RDF, SPARQL
Martin Przyjaciel-Zablocki, Alexander Schaetzle, Eduard Skaley, Thomas Hornung, Georg Lausen, "Map-Side Merge Joins for Scalable SPARQL BGP Processing", 2013 IEEE 5th International Conference on Cloud Computing Technology and Science, vol. 01, pp. 631-638, 2013, doi:10.1109/CloudCom.2013.9