14th International Parallel and Distributed Processing Symposium (IPDPS'00)
Optimizing Retrieval and Processing of Multi-Dimensional Scientific Datasets
Cancun, Mexico
May 1-5, 2000
ISBN: 0-7695-0574-0
We have developed the Active Data Repository (ADR), an infrastructure that integrates storage, retrieval, and processing of large multi-dimensional scientific datasets on distributed-memory parallel machines with multiple disks attached to each node. In earlier work, we proposed three strategies for processing range queries within the ADR framework. Our experimental results show that the relative performance of the strategies changes under varying application characteristics and machine configurations. In this work, we investigate approaches to guide and automate the selection of the best strategy for a given application and machine configuration. We describe analytical models to predict the relative performance of the strategies when input data elements are uniformly distributed in the attribute space of the output dataset, restricting the output dataset to be a regular d-dimensional array.
Index Terms:
High-performance computing, Data-intensive applications, Performance evaluation and optimization, Performance models
Citation:
Chialin Chang, Tahsin Kurc, Alan Sussman, Joel Saltz, "Optimizing Retrieval and Processing of Multi-Dimensional Scientific Datasets," ipdps, pp. 405, 14th International Parallel and Distributed Processing Symposium (IPDPS'00), 2000