Hong Kong, SAR
July 16, 2003 to July 18, 2003
Jingren Zhou , Columbia University
We propose a new storage model called MBSM (Multiresolution Block Storage Model) for laying out tables on disks. MBSM is intended to speed up operations such as scans that are typical of data warehouse workloads. Disk blocks are grouped into "super-blocks," with a single record stored in a partitioned fashion among the blocks in a superblock. The intention is that a scan operation that needs to consult only a small number of attributes can access just those blocks of each super-block that contain the desired attributes. To achieve good performance given the physical characteristics of modern disks, we organize super-blocks on the disk into .xed-size "mega-blocks." Within a megablock, blocks of the same type (from various super-blocks) are stored contiguously. We describe the changes needed in a conventional database system to manage tables using such a disk organization. We demonstrate experimentally that MBSM outperforms competing approaches such as NSM (N-ary Storage Model), DSM (Decomposition Storage Model) and PAX (Partition Attributes Across), for I/O bound decision-support workloads consisting of scans in which not all attributes are required. This improved performance comes at the expense of single-record insert and delete performance; we quantify the trade-offs involved. Unlike DSM, the cost of reconstructing a record from its partitions is small. MBSM stores attributes in a vertically partitioned manner similar to PAX, and thus shares PAX?s good CPU cache behavior. We describe methods for mapping attributes to blocks within super-blocks in order to optimize overall performance, and show how to tune the super-block and mega-block sizes.
Jingren Zhou, "A Multi-Resolution Block Storage Model for Database Design", IDEAS, 2003, Database Engineering and Applications Symposium, International, Database Engineering and Applications Symposium, International 2003, pp. 22, doi:10.1109/IDEAS.2003.1214908