Issue No.12 - December (2011 vol.23)
Mohamed E. Khalefa, University of Minnesota, Minneapolis
Justin J. Levandoski, University of Minnesota, Minneapolis
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2010.182
This paper introduces an efficient framework for producing high and early result throughput in multijoin query plans. While most previous research focuses on optimizing for cases involving a single join operator, this work takes a radical step by addressing query plans with multiple join operators. The proposed framework consists of two main methods: a flush algorithm and an operator state manager. The framework assumes that incoming data is processed with a symmetric hash join, a common method for producing early results. In this way, our methods can be applied to a group of previous join operators (optimized for single-join queries) when they take part in multijoin query plans. Specifically, our framework can be applied by 1) employing a new flushing policy that writes in-memory data to disk once the memory allotment is exhausted, in a way that increases the likelihood of producing early results in multijoin queries, and 2) employing a state manager that adaptively switches operators in the plan between joining in-memory data and joining disk-resident data in order to improve early result throughput. Extensive experimental results show that the proposed methods outperform the state-of-the-art join operators optimized for both single-join and multijoin query plans.
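For readers unfamiliar with the symmetric hash join that the abstract assumes, the following is a minimal in-memory sketch in Python. It illustrates only the generic technique (each arriving tuple is stored in its own side's hash table and immediately probed against the other side's table, so results appear as early as possible). It is not the paper's framework: the flushing policy and the state manager are not shown, and the class name `SymmetricHashJoin`, the `insert` method, and the `"left"`/`"right"` labels are hypothetical names chosen for illustration.

```python
from collections import defaultdict


class SymmetricHashJoin:
    """Illustrative in-memory symmetric hash join (hypothetical names)."""

    def __init__(self):
        # One hash table per input, keyed by the join attribute.
        self.tables = {"left": defaultdict(list), "right": defaultdict(list)}

    def insert(self, side, key, row):
        """Store an arriving row and return the join results it produces."""
        if side not in self.tables:
            raise ValueError("side must be 'left' or 'right'")
        other = "right" if side == "left" else "left"

        # Build step: remember the row in its own side's table.
        self.tables[side][key].append(row)

        # Probe step: match against rows already seen on the other side.
        # .get avoids creating empty buckets in the other table.
        matches = self.tables[other].get(key, [])

        # Always emit pairs as (left_row, right_row).
        if side == "left":
            return [(row, m) for m in matches]
        return [(m, row) for m in matches]


if __name__ == "__main__":
    join = SymmetricHashJoin()
    print(join.insert("left", 1, "a"))   # no right-side rows seen yet
    print(join.insert("right", 1, "x"))  # matches the stored left row
    print(join.insert("left", 1, "b"))   # matches the stored right row
```

Because both inputs are built and probed symmetrically, a result is emitted as soon as its second matching tuple arrives, which is the property that makes this operator a natural basis for early-result techniques. In this sketch both hash tables grow without bound; handling memory exhaustion is exactly the problem the flushing policy described in the abstract addresses.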
Database management, systems, query processing.
Mohamed E. Khalefa, Justin J. Levandoski, "On Producing High and Early Result Throughput in Multijoin Query Plans", IEEE Transactions on Knowledge & Data Engineering, vol. 23, no. 12, pp. 1888-1902, December 2011, doi:10.1109/TKDE.2010.182