Issue No.01 - Jan. (2014 vol.26)
pp: 3-15
Stratis D. Viglas, University of Edinburgh, Edinburgh
ABSTRACT
Multicore systems and multithreaded processing are now the de facto standards of enterprise and personal computing. If used in an uninformed way, however, multithreaded processing might actually degrade performance. We present the facets of the memory access bottleneck as they manifest in multithreaded processing and show their impact on query evaluation. We present a system design based on partition parallelism, memory pooling, and data structures conducive to multithreaded processing. Based on this design, we present alternative implementations of the most common query processing algorithms, which we experimentally evaluate using multiple scenarios and hardware platforms. Our results show that the design and algorithms are indeed scalable across platforms, but the choice of optimal algorithm largely depends on the problem parameters and underlying hardware. However, our proposals are a good first step toward generic multithreaded parallelism.
INDEX TERMS
Instruction sets, Hardware, Arrays, Query processing, Partitioning algorithms, Resource management, Context, multithreaded processors, Parallel databases, query processing, parallel algorithms, parallel processors
CITATION
Stratis D. Viglas, "A Comparative Study of Implementation Techniques for Query Processing in Multicore Systems", IEEE Transactions on Knowledge & Data Engineering, vol. 26, no. 1, pp. 3-15, Jan. 2014, doi:10.1109/TKDE.2012.243