Querying Very Large Multi-dimensional Datasets in ADR
Proceedings of the 1999 ACM/IEEE Conference on Supercomputing
Portland, Oregon, USA, November 13-18, 1999
ISBN: 1-58113-091-0
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/SC.1999.10046
Applications that make use of very large scientific datasets have become an increasingly important subset of scientific applications. In these applications, datasets are often multi-dimensional, i.e., data items are associated with points in a multi-dimensional attribute space, and access to data items is described by range queries. The basic processing involves mapping input data items to output data items, followed by some form of aggregation of all the input data items that project to each output data item. We have developed an infrastructure, called the Active Data Repository (ADR), that integrates storage, retrieval and processing of multi-dimensional datasets on distributed-memory parallel architectures with multiple disks attached to each node. In this paper we address efficient execution of range queries on distributed-memory parallel machines within the ADR framework. We present three potential strategies and evaluate them under different application scenarios and machine configurations. We present experimental results on the scalability and performance of the strategies on a 128-node IBM SP.
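The processing pattern described in the abstract (select input items with a range query, map each to an output item, aggregate everything that lands on the same output item) can be illustrated with a minimal serial sketch. This is not ADR's API and not one of the paper's three strategies: the names `in_range`, `range_query_aggregate`, `project`, and `combine` are hypothetical, chosen only for illustration, and the sample data is invented.

```python
from collections import defaultdict

def in_range(point, lo, hi):
    """Return True if point lies inside the axis-aligned box [lo, hi] (inclusive)."""
    return all(l <= p <= h for p, l, h in zip(point, lo, hi))

def range_query_aggregate(items, lo, hi, project, combine, init):
    """Hypothetical helper, not part of ADR.

    items   : iterable of (point, value) pairs in the input attribute space
    lo, hi  : corners of the range query box
    project : maps an input point to an output item key
    combine : aggregation function applied as combine(accumulator, value)
    init    : initial accumulator value for each output item
    """
    out = defaultdict(lambda: init)
    for point, value in items:
        if in_range(point, lo, hi):          # range query selects input items
            key = project(point)             # map input item to output item
            out[key] = combine(out[key], value)  # aggregate per output item
    return dict(out)

# Invented sample data: 3-D points carrying a scalar value.
items = [
    ((0.5, 0.5, 1.0), 2.0),
    ((1.5, 0.2, 3.0), 4.0),
    ((0.9, 0.1, 7.0), 1.0),   # third coordinate falls outside the query box
    ((5.0, 5.0, 0.0), 9.0),   # outside the query box
]

result = range_query_aggregate(
    items,
    lo=(0, 0, 0),
    hi=(2, 1, 5),
    project=lambda p: (int(p[0]), int(p[1])),  # drop third dimension, snap to unit grid
    combine=lambda acc, v: acc + v,            # sum as the aggregation
    init=0.0,
)
print(result)
```

The sketch runs on a single process and holds all accumulators in memory. The paper's subject is how to carry out this kind of computation when the input and the accumulators are spread across the nodes and disks of a distributed-memory machine.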
Citation:
Tahsin Kurc, Chialin Chang, Renato Ferreira, Alan Sussman, Joel Saltz, "Querying Very Large Multi-dimensional Datasets in ADR," sc, pp.12, Proceedings of the 1999 ACM/IEEE conference on Supercomputing, 1999